Search/Find sub strings in a string

Philipp Mueller on 9 Feb 2021
Edited: Stephen23 on 12 Feb 2021
Hello,
I have two cell arrays:
  1. The first is a 1x279 cell array containing component abbreviations such as Ra, Bt, Rag, Rg, Vm, SzF and so on.
  2. The second is a 1x439 cell array containing strings like 'F.y_1T.Rg1_N7' or 'F.y_1T.Ra1.SzF2_N590'.
As you can see, the component abbreviations from the first cell array appear in the second cell array as substrings of the whole string.
  1. First example: 'F.y_1T.Rg1_N7' from the second cell array -> the component abbreviation Rg from the first cell array is found. (The rest of the string is something else and not important.)
  2. Second example: 'F.y_1T.Ra1.SzF2_N590' from the second cell array -> Ra and SzF from the first cell array are found.
So my goal is to find out which of the 279 component abbreviations occur in the second array. I have tried a lot but have no working solution. I have forgotten my MATLAB skills because I do not need them very often.
Here is a small part of my code (not working, and not the whole code):
for t=1:size; %279
    for h=1:numCols1 %439
        locigal(t,h) = strcmp(Messstelle{1,h}, abbrev_components{1,t})
        %locigal_1{t,h} = strcmp(Messstelle{1,h}, abbrev_components{1,t})
        %locigal{t,h} = strmatch(Messstelle{1,h},abbrev_components{1,t})
    end
end
What is the difference between 'F.y_1T.Rg1_N7' and "F.y_1T.Rg1_N7"? Both are cell arrays. How can I convert between them?
Thank you
  2 Comments
Stephen23 on 9 Feb 2021
Edited: Stephen23 on 9 Feb 2021
"What is the difference between 'F.y_1T.Rg1_N7' and "F.y_1T.Rg1_N7"? Both are cell arrays."
  • 'F.y_1T.Rg1_N7' is a character vector.
  • "F.y_1T.Rg1_N7" is a string scalar.
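As a minimal sketch of the distinction and of how to convert in either direction (the variable names are illustrative only; double-quoted string literals assume R2017a or later):

```matlab
c = 'F.y_1T.Rg1_N7';       % character vector (single quotes)
s = "F.y_1T.Rg1_N7";       % string scalar (double quotes)

ischar(c)                  % true
isstring(s)                % true

s2 = string(c);            % character vector -> string scalar
c2 = char(s);              % string scalar -> character vector

% The same conversions work on whole containers:
C  = {'Ra', 'Bt', 'Rag'};  % cell array of character vectors
S  = string(C);            % 1x3 string array
C2 = cellstr(S);           % back to a cell array of character vectors
```

Functions such as regexp, strrep and contains accept either form, so a conversion is usually only needed when other code expects one specific type.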
If the text contains Rag, do you want to prevent matching Ra ? If so, how?
Philipp Mueller on 9 Feb 2021
Thank you for your answer. If the text contains Rag, then only Rag should match; Ra should not match in that case. I want to prevent matching Ra.


Answers (2)

Stephen23 on 9 Feb 2021
Edited: Stephen23 on 9 Feb 2021
This matches abbreviations followed by one digit (thus avoiding the Ra/Rag matching problem):
C = {'Ra','Bt','Rag','Rg','Vm','SzF'};
D = {'F.y_1T.Rg1_N7','F.y_1T.Rag1.SzF2_N590'}; % Ra changed to Rag !
rgx = sprintf('|%s',C{:});
rgx = sprintf('(%s)%s',rgx(2:end),'(?=\d)');
tmp = regexp(D,rgx,'match');
tmp{:}
ans = 1x1 cell array
    {'Rg'}
ans = 1x2 cell array
    {'Rag'}    {'SzF'}
If other characters can trail the abbreviations, then adapt the lookahead assertion as required.
After this you can simply do ismember on each cell of tmp:
boo = cellfun(@(c)ismember(C,c),tmp,'uni',0);
boo = vertcat(boo{:})
boo = 2x6 logical array
   0   0   0   1   0   0
   0   0   1   0   0   1
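One possible way to adapt the lookahead, sketched here under the assumption that an abbreviation may be followed by anything except another letter (for example a dot or an underscore as well as a digit), is a negative lookahead. The test string with 'Rag.' is made up for illustration:

```matlab
C = {'Ra','Bt','Rag','Rg','Vm','SzF'};
D = {'F.y_1T.Rg1_N7','F.y_1T.Rag.SzF2_N590'};   % 'Rag' is followed by '.', not by a digit

% Build 'Ra|Bt|Rag|Rg|Vm|SzF' and require that no letter follows the match.
alt = strjoin(C, '|');
rgx = ['(' alt ')(?![A-Za-z])'];

tmp = regexp(D, rgx, 'match');
tmp{:}
```

With these inputs the first string gives {'Rg'} and the second gives {'Rag'} {'SzF'}: 'Ra' is rejected inside 'Rag' because the next character is a letter. This sketch only guards the trailing side; if an abbreviation could also appear at the end of a longer word, a lookbehind would be needed as well.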
  2 Comments
Philipp Mueller on 12 Feb 2021
Sorry for the late reply. I have worked through your code. But I have one last request: as a result, I would like an array that does not contain any duplicate entries. Certain component abbreviations appear several times, and I have often seen a 1x2 cell. I just want a simple array with no duplicate entries. I hope you understand what I mean, and I apologize for my bad English. Thank you for your active help. Kind regards
Stephen23 on 12 Feb 2021
Edited: Stephen23 on 12 Feb 2021
"I just want a simple array with no duplicate entries."
Each row of array boo contains true to indicate if an abbreviation occurs one or more times in the corresponding string. It does not contain duplicate true values for any one abbreviation.
Please give an example string with duplicate values, and also the expected output.
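If the request means a single list of the abbreviations that occur anywhere in the strings, each listed once, one possible sketch is the following. This is only a guess at the intended output; the third test string is made up so that 'Rg' and 'SzF' occur more than once:

```matlab
C = {'Ra','Bt','Rag','Rg','Vm','SzF'};
D = {'F.y_1T.Rg1_N7','F.y_1T.Rag1.SzF2_N590','F.y_1T.Rg2.SzF1_N12'};

rgx = ['(' strjoin(C, '|') ')(?=\d)'];   % same pattern as in the answer above
tmp = regexp(D, rgx, 'match');

allMatches = [tmp{:}];                   % every match from every string in one cell array
found = C(ismember(C, allMatches))       % each abbreviation at most once, in the order of C
```

For these inputs found is {'Rag'} {'Rg'} {'SzF'}. Indexing C with the logical mask means an abbreviation cannot be listed twice, however often it was matched.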



Jan on 9 Feb 2021
Edited: Jan on 9 Feb 2021
KeyList = {'Ra', 'Bt', 'Rag', 'Rg', 'Vm', 'SzF'};
StringList = {'F.y_1T.Rg1_N7', 'F.y_1T.Ra1.SzF2_N590', ...
              'F.y_1T.Rag.SzF2_N590'};
nKey = numel(KeyList);
nString = numel(StringList);
L = false(nString, nKey);   % L(s, t) is true if string s contains key t
for t = 1:nKey
    % Mask other keys that contain the current key (e.g. 'Rag' when searching 'Ra'):
    exclude = find(contains(KeyList, KeyList{t}));
    exclude(exclude == t) = [];
    S = StringList;
    for k = exclude
        S = strrep(S, KeyList{k}, '*');
    end
    L(:, t) = contains(S, KeyList{t});
    % Matlab < R2016b:
    % L(:, t) = ~strcmp(S, strrep(S, KeyList{t}, ''));
end
Now "Rag" is masked when "Ra" is searched for.
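As a hedged sketch of how the logical matrix L might then be read: the values of L below are what the loop yields for the three example strings, copied in by hand so that the snippet runs on its own, and the names foundKeys and perString are illustrative only.

```matlab
KeyList = {'Ra', 'Bt', 'Rag', 'Rg', 'Vm', 'SzF'};

% Rows correspond to the strings, columns to the keys.
L = logical([0 0 0 1 0 0; ...
             1 0 0 0 0 1; ...
             0 0 1 0 0 1]);

% Keys that occur in at least one string:
foundKeys = KeyList(any(L, 1));

% Keys found in each individual string:
perString = cell(size(L, 1), 1);
for k = 1:size(L, 1)
    perString{k} = KeyList(L(k, :));
end
```

Here foundKeys is {'Ra'} {'Rag'} {'Rg'} {'SzF'}, and perString{2} holds 'Ra' and 'SzF' for the second string.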
