How to find an exact string match in a list of folder names

Hello everyone,
I have a problem trying to extract data from a sequence of files, based on exact string names contained within the subfolder names. The problem is extracting data to Y_NODE and XY_NODE: contains cannot differentiate between 'Y_High' and 'XY_High', so all of that data ends up in the Y_NODE variable. I have tried contains, matches, strcmp, strfind etc., but I cannot get it to match correctly and assign the data to the correct cell array.
I cannot attach the raw data because it is too large, but the list of folder names is attached.
Could someone help please?
pattern = ["No_High1_add_on","X_High1_add_on","Y_High1_add_on","XY_High1_add_on"];
for k = 1:numberOfFolders
    % Get this folder and print it out.
    thisFolder = listOfFolderNames{k};
    if contains(thisFolder,pattern(1))
        J = 1;
    elseif contains(thisFolder,pattern(2))
        J = 2;
    elseif contains(thisFolder,pattern(3))
        J = 3;
    elseif contains(thisFolder,pattern(4))
        J = 4;
    end
    filePattern = sprintf('%s/*node.csv', thisFolder);
    baseFileNames = dir(filePattern);
    numberOfImageFiles = length(baseFileNames);
    if numberOfImageFiles >= 1
        % Go through all those files.
        for f = 1 : numberOfImageFiles
            fullFileName = fullfile(thisFolder, baseFileNames(f).name);
            if J == 1
                NO_NODE{k} = importdata(fullFileName);
            elseif J == 2
                X_NODE{k} = importdata(fullFileName);
            elseif J == 3
                Y_NODE{k} = importdata(fullFileName);
            elseif J == 4
                XY_NODE{k} = importdata(fullFileName);
            end
        end
    else
        fprintf(' Folder %s has no files in it.\n', thisFolder);
    end
end

Accepted Answer

Guillaume on 16 Mar 2020
Edited: Guillaume on 16 Mar 2020
It's simple to solve: rather than testing first for 'X', then 'Y', then 'XY', test first for 'XY', then for 'X' and 'Y' in either order. If the first test passes, then the folder is guaranteed to be an 'XY' one.
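To see why the order of the tests matters, here is a minimal sketch that runs contains on a few made-up folder names (the 'run1/...' paths are hypothetical, not taken from the attached list). It also shows that only the 'Y' pattern is actually a substring of the 'XY' one, so putting 'XY' ahead of both is the safe choice rather than a strict requirement for 'X'.

```matlab
% Hypothetical folder names, for illustration only.
folders = {'run1/No_High1_add_on', 'run1/X_High1_add_on', ...
           'run1/Y_High1_add_on',  'run1/XY_High1_add_on'};

% 'Y_High1_add_on' is a substring of 'XY_High1_add_on', so contains()
% reports a match for the Y folder AND for the XY folder.
matchesY  = contains(folders, "Y_High1_add_on");

% 'X_High1_add_on' is not a substring of 'XY_High1_add_on'
% (the X is followed by Y, not by an underscore).
matchesX  = contains(folders, "X_High1_add_on");

% The XY pattern only matches the XY folder.
matchesXY = contains(folders, "XY_High1_add_on");

% Testing XY first amounts to excluding XY folders from the Y test.
isXY = matchesXY;
isY  = matchesY & ~isXY;

disp([matchesX; matchesY; matchesXY; isY]);
```

Each row of the displayed matrix is one test and each column one folder, so the double match for the last folder shows up in the second row.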
Note that a chain of if...elseif... branches that all do the same thing is usually bad design. It's not easy to extend to many more patterns. If you had 30 different patterns, would you write 30 different tests? A loop would make the code much simpler:
pattern = ["No_High1_add_on", "XY_High1_add_on", "X_High1_add_on", "Y_High1_add_on"]; %XY pattern MUST precede X and Y pattern since it is a superset
for k = 1:numel(listOfFolderNames)
    % Get this folder and print it out.
    thisFolder = listOfFolderNames{k};
    matchedpattern = 0;
    for patternindex = 1:numel(pattern)
        if contains(thisFolder, pattern(patternindex))
            matchedpattern = patternindex;
            break; %stop at the first match so that the XY pattern wins over the Y pattern
        end
    end
    if matchedpattern == 0, continue, end %no match found
    %... list and import the files for this folder, as in the original code
end
Similarly, later on I would not use differently named variables to store the data. That design is very likely to force you to copy the same code each time you want to process each variable, when again a loop would avoid the repetition. I would store the imported files in a cell array of cell arrays:
pattern = ["No_High1_add_on", "XY_High1_add_on", "X_High1_add_on", "Y_High1_add_on"]; %XY pattern MUST precede X and Y pattern since it is a superset
patterndata = cell(size(pattern)); %cell array to store the imported files for each pattern
for k = 1:numel(listOfFolderNames)
    %... find matchedpattern and list the files for this folder, as above
    for f = 1 : numberOfImageFiles
        fullFileName = fullfile(thisFolder, baseFileNames(f).name);
        patterndata{matchedpattern}{end+1} = importdata(fullFileName); %#ok<AGROW> Number of files in each category is unknown so have no choice but to grow the array
    end
end
Note that unlike your original code, the above does not leave empty cells in each cell array. (On a given k, your original code only filled one of the NO_NODE, X_NODE, etc. cell arrays, leaving the others with an empty cell at index k.)
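As a rough illustration of why the nested cell array is convenient, the sketch below fills patterndata with small stand-in vectors (made up here in place of real importdata output) and then processes every category with one loop instead of one block of code per variable.

```matlab
pattern = ["No_High1_add_on", "XY_High1_add_on", "X_High1_add_on", "Y_High1_add_on"];
patterndata = cell(size(pattern));

% Stand-in data: in the real code each inner cell would hold the result
% of importdata for one file. These values are invented for the example.
patterndata{2}{end+1} = [1 2 3];
patterndata{2}{end+1} = [4 5 6];
patterndata{4}{end+1} = [7 8 9];

% One loop handles every category, however many patterns there are.
filecount = zeros(size(pattern));
for p = 1:numel(pattern)
    filecount(p) = numel(patterndata{p});
    fprintf('%s: %d file(s)\n', pattern(p), filecount(p));
    for f = 1:numel(patterndata{p})
        data = patterndata{p}{f}; %#ok<NASGU> process each imported file here
    end
end
```

Categories with no matching folder simply stay empty, so nothing special is needed for them in the processing loop.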
  1 Comment
Richard Rees on 16 Mar 2020
Hi, that is very nice, and thank you for the explanations as well; they will be taken on board.


More Answers (1)

Image Analyst on 15 Mar 2020
strcmp() should work. I'd like to see code where it doesn't. contains() won't work - it will operate as you said since 'Y_High' is contained inside 'XY_High'. But I really think strcmp() should.
At first I thought maybe it's because you're comparing strings to character arrays. Your pattern is a string array, not a cell array of character arrays like listOfFolderNames probably is. Strings and character arrays have been different types of variables in MATLAB since string arrays were introduced in R2016b. But when I did a test, it showed this is not the case and they still match despite being of different variable types:
s1 = "abc" % A string
s2 = 'abc' % A character vector
e1 = isequal(s1, s2)
e2 = strcmp(s1, s2)
e3 = contains(s1, s2)
e1, e2, and e3 all show as true.
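One possible reason strcmp appeared not to work is that it compares whole strings, so a full folder path never equals a bare pattern. The sketch below assumes the pattern is the complete name of the last folder in the path, which may not hold for the real folder list; the 'results/run1' path is made up for the example.

```matlab
pattern = ["No_High1_add_on", "X_High1_add_on", "Y_High1_add_on", "XY_High1_add_on"];

% Hypothetical full path to one subfolder.
thisFolder = fullfile('results', 'run1', 'XY_High1_add_on');

% Comparing the full path against the patterns finds nothing,
% because strcmp only succeeds on an exact, whole-string match.
Jfull = find(strcmp(thisFolder, pattern));

% Strip the parent folders first, then compare the bare folder name.
[~, folderName] = fileparts(thisFolder);
J = find(strcmp(folderName, pattern));

fprintf('Folder "%s" matched pattern number %d\n', folderName, J);
```

With an exact comparison the order of the patterns no longer matters, since 'XY_High1_add_on' is not equal to 'Y_High1_add_on' even though it contains it.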
Image Analyst on 16 Mar 2020
Does this work:
locations = strfind(A, Pattern)
It tells you what index Pattern starts at in A.
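Note that strfind on its own still finds 'Y_High1_add_on' inside 'XY_High1_add_on'; what helps is the position it returns. A minimal sketch, assuming a pattern only counts as a real match when it starts the name or follows a non-letter character such as a file separator (the names and the isStandalone variable are invented for the example):

```matlab
Pattern = 'Y_High1_add_on';
names = {'data/Y_High1_add_on', 'data/XY_High1_add_on'}; % hypothetical names

isStandalone = false(size(names));
for n = 1:numel(names)
    A = names{n};
    locations = strfind(A, Pattern); % start index of each occurrence, [] if none
    for loc = locations
        % Accept the occurrence only if it is not glued to a preceding letter.
        if loc == 1 || ~isletter(A(loc-1))
            isStandalone(n) = true;
        end
    end
end

disp(isStandalone);
```

In the second name the pattern is found, but the character just before it is the letter X, so it is rejected as part of the longer 'XY' name.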
