# Group the array with similar strings

Zeynab Mousavikhamene on 21 Nov 2019
Answered: Max Murphy on 24 Nov 2019
How can I find the strings that share a common substring?
A=['x_B1_0_s_0' , 'x_B1_0_s_1', 'x_B1_0_s_2', 'y_B1_2_s_0' , 'y_B1_2_s_1', 'y_B1_2_s_2' , 'x_B2_0_s_0' , ...
'x_B2_0_s_1' , 'x_B2_0_s_2' , 'y_B2_2_s_0' , 'y_B2_2_s_1' , 'y_B2_2_s_2' ]
I need to group the above array into 4 groups of strings, like this:
B=['x_B1_0', 'y_B1_2', 'x_B2_0', 'y_B2_2']
Any idea how to do so?

Guillaume on 23 Nov 2019
It's very unclear what your definition of similar is.
Also note that your example A is a char vector, not a string, and that concatenating char vectors makes just one big char vector.
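To illustrate the difference, here is a minimal sketch (the variable names c, s and k are only for illustration) comparing square-bracket concatenation of char vectors with a string array and a cell array:

```matlab
% Square brackets join char vectors into ONE long char vector
c = ['x_B1_0_s_0', 'x_B1_0_s_1'];
isCharVector = ischar(c);   % true
sizeOfC = size(c);          % [1 20]: two 10-character names fused together

% A string array (double quotes) keeps each name as a separate element
s = ["x_B1_0_s_0", "x_B1_0_s_1"];
nStrings = numel(s);        % 2

% A cell array of char vectors also keeps the names separate
k = {'x_B1_0_s_0', 'x_B1_0_s_1'};
nCells = numel(k);          % 2
```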
Assuming you meant A to be a string array, one way to obtain your desired result is:
A = ["x_B1_0_s_0" , "x_B1_0_s_1", "x_B1_0_s_2", "y_B1_2_s_0" , "y_B1_2_s_1", "y_B1_2_s_2" ,"x_B2_0_s_0" , ...
"x_B2_0_s_1" ,"x_B2_0_s_2" , "y_B2_2_s_0" , "y_B2_2_s_1" , "y_B2_2_s_2" ]
B = unique(extractBefore(A, 7))
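Note that unique sorts its output by default, so B comes back in alphabetical order rather than in the order shown in the question. If the order of first appearance matters, or if you also want to know which elements belong to which group, something like the following sketch may help. It assumes every element contains the delimiter "_s_" (so the prefix does not need a fixed length); the names prefix, groupIdx and groups are only illustrative.

```matlab
A = ["x_B1_0_s_0", "x_B1_0_s_1", "x_B1_0_s_2", ...
     "y_B1_2_s_0", "y_B1_2_s_1", "y_B1_2_s_2", ...
     "x_B2_0_s_0", "x_B2_0_s_1", "x_B2_0_s_2", ...
     "y_B2_2_s_0", "y_B2_2_s_1", "y_B2_2_s_2"];

% Take everything before the first "_s_" (assumption: every element has one)
prefix = extractBefore(A, "_s_");

% 'stable' keeps the order of first appearance instead of sorting;
% groupIdx(i) is the position in B of the group that A(i) belongs to
[B, ~, groupIdx] = unique(prefix, 'stable');

% Collect the members of each group into a cell array, one cell per group
groups = arrayfun(@(g) A(groupIdx == g), 1:numel(B), 'UniformOutput', false);
```

Here groups{2} holds the three "y_B1_2_s_*" elements, and B matches the order requested in the question.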

Max Murphy on 24 Nov 2019
Seems like you are trying to parse out a bunch of variable names and associated values and group them in a table. Ideally, it would work something like this:
% I assume it should be a cell array of char, or an array of strings
A={'x_B1_0_s_0' , 'x_B1_0_s_1', 'x_B1_0_s_2', ... % formatted
'y_B1_2_s_0' , 'y_B1_2_s_1', 'y_B1_2_s_2' , ...
'x_B2_0_s_0' , 'x_B2_0_s_1' , 'x_B2_0_s_2' , ...
'y_B2_2_s_0' , 'y_B2_2_s_1' , 'y_B2_2_s_2' };
% varExp can be changed, but does the parsing using regexp basically
varExp = '(?<label>[xy]_B\d_\d)_(?<data>\w)';
S = dynamicVarParser(A,varExp);
T = struct2table(S);
And here is the helper function I wrote:
function S = dynamicVarParser(A,varExp)
%% DYNAMICVARPARSER Parse cell array of char vectors into struct fields
%
% S = dynamicVarParser(A,varExp);
%
% e.g.
% S = dynamicVarParser({'x_B1_0_s_0','x_B2_0_s_2'});
%%
if nargin < 2
end
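% Parse the first element to create the struct, then append the rest.
% Each element must match varExp exactly once for the assignment to work.
S = regexp(A{1},varExp,'names');
for i = 2:numel(A)
    S(i) = regexp(A{i},varExp,'names');
end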
end
Note that if you call it without varExp, it uses a default expression that differs from the one I coded above. That default reflects a different possible grouping that may be useful to you based on this question; I'm not sure what your data looks like at the end of the day.
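As a rough illustration of how a different expression changes the grouping, here is a sketch that skips the helper and calls regexp on the whole cell array at once. The pattern below is hypothetical (it is not the default mentioned above): it captures the prefix as label and the trailing number as idx, and it assumes every element matches. The names tokens, uniqueLabels and counts are also just for illustration.

```matlab
A = {'x_B1_0_s_0', 'x_B1_0_s_1', 'x_B1_0_s_2', ...
     'y_B1_2_s_0', 'y_B1_2_s_1', 'y_B1_2_s_2', ...
     'x_B2_0_s_0', 'x_B2_0_s_1', 'x_B2_0_s_2', ...
     'y_B2_2_s_0', 'y_B2_2_s_1', 'y_B2_2_s_2'};

% Hypothetical pattern: prefix -> label, trailing digits -> idx
varExp = '(?<label>[xy]_B\d_\d)_s_(?<idx>\d+)';

% With a cell array input and 'once', regexp returns one scalar struct per cell
tokens = regexp(A, varExp, 'names', 'once');
S = [tokens{:}];                 % concatenate into a 1x12 struct array

labels = {S.label};              % cell array of char vectors
idx = str2double({S.idx});       % trailing index as a number

% Group by label, keeping the order of first appearance
[uniqueLabels, ~, g] = unique(labels, 'stable');
counts = accumarray(g(:), 1).';  % number of elements in each group
```

If some element does not match the pattern, the concatenation step will fail or drop it, so it is worth checking for empty results first on real data.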