# Compare two strings with some restrictions

vicente Noguer on 15 Sep 2021
Commented: vicente Noguer on 23 Sep 2021
Hey, how are you?
I have to compare two strings of n and m lines with each other to see whether they contain the same messages. The messages look like this:
!AIVDM,1,1,,A,137JlD52h0P9tdRGbCQSm0kV0<1p,0*4C0053
!AIVDM,1,1,,B,13EsReP00009vQ`Gbj65gPiV00Sd,0*7B0053
!AIVDM,1,1,,B,15AMJH00000:1i8Ga0v@0Akb00Sw,0*390053
!AIVDM,1,1,,B,13EcsW7P00P:07JGc9Wh0?wb2<0m,0*0E0053
!AIVDM,1,1,,A,13Efqs800109q6fGb0tHhq?d2L1<,0*3C0054
!AIVDM,1,1,,A,137FrD02Bu0:=9@GS16Vu5O`00S<,0*5D0054
As you can see, the last four digits range from 0000 to 5959: the first two are minutes and the last two are seconds. I have code that compares all the messages from one script to another, but now I need to compare only the messages whose ending falls within a range that we choose. Example:
!AIVDM,1,1,,A,13GQ>C@P00P9rHrGasGf4?wn20SK,0*020059
This message ends with 0059, so I should compare it with all the messages that end with a number between 0000 and 0159. In other words, the comparison covers the messages whose time is within one minute before or after that of the message.
vicente Noguer on 15 Sep 2021
Okay one string is this one:
!AIVDM,1,1,,A,137JlD52h0P9tdRGbCQSm0kV0<1p,0*4C0053
!AIVDM,1,1,,B,13EsReP00009vQ`Gbj65gPiV00Sd,0*7B0053
!AIVDM,1,1,,B,15AMJH00000:1i8Ga0v@0Akb00Sw,0*390053
!AIVDM,1,1,,B,13EcsW7P00P:07JGc9Wh0?wb2<0m,0*0E0053
!AIVDM,1,1,,A,13Efqs800109q6fGb0tHhq?d2L1<,0*3C0054
and the other string is:
"!AIVDM,1,1,,A,13ErMfPP00P9rFpGasc>4?wn2802,0*070000"
"!AIVDM,1,1,,B,13FMMd0P0009o1jGapD=5gwl06p0,0*780000"
"!AIVDM,1,1,,A,4028ioivDfFss09kDvGag6G0080D,0*790000"
"!AIVDM,1,1,,A,D028ioj<Tffp,0*2C0000"
So the output is another string that contains the messages that are in both strings.

Walter Roberson on 15 Sep 2021
The result would have been a cell array of character vectors. You can str2double() to get a set of decimal numbers.
Once you have the set of decimal numbers, referred to below as DN, then
dur = minutes(floor(DN/100)) + seconds(mod(DN,100));
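As a small illustration of that conversion, here is a sketch that assumes the messages are held one per element in a string array. The name msgs and the two sample messages are only placeholders taken from the question, and extractAfter is just one possible way to pull out the last four characters.

```matlab
% Hypothetical input: one message per element, each ending in MMSS
msgs = ["!AIVDM,1,1,,A,137JlD52h0P9tdRGbCQSm0kV0<1p,0*4C0053"
        "!AIVDM,1,1,,A,13GQ>C@P00P9rHrGasGf4?wn20SK,0*020059"];

% Keep only the last four characters of every message, e.g. "0053"
tail = extractAfter(msgs, strlength(msgs) - 4);

% Convert the text to decimal numbers (leading zeros are dropped)
DN = str2double(tail);

% First two digits are minutes, last two are seconds
dur = minutes(floor(DN/100)) + seconds(mod(DN,100));
```

str2double accepts a cell array of character vectors as well as a string array, so the same lines should work for either kind of input.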
If you do that for both sets of data, getting dur1 and dur2, then
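[~, M1, S1] = hms(dur1);                   % minutes and seconds of the first set
[~, M2, S2] = hms(dur2);                   % minutes and seconds of the second set
[has_match0, idx0] = ismember(M1, M2);     % same minute present in the second set
[has_match1, idx1] = ismember(M1+1, M2);   % next minute present in the second set
M1_has_match = has_match0 | has_match1;
M1_match = zeros(size(M1));                % 0 marks entries without a match
M1_match(has_match1) = idx1(has_match1);
M1_match(has_match0) = idx0(has_match0);   % a same-minute match takes precedence
M1_matches = find(M1_has_match);
M2_matches = M1_match(M1_has_match);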
If I got everything right, then M1_matches will be the index into the first set of durations in which there are matches, and M2_matches will be the corresponding indexes into the second set of durations that match the first set.
Any one entry in the first set of durations is only looked for once in the second set of durations, but because of the matching process, any given entry in the second set of durations could match more than one entry in the first set of durations. You did not ask for the closest match that occurs within a particular time interval: you asked for matches that occur if the second set has any entry that has the same minute as one in the first set, or is the next minute after one in the first set.
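For comparison, if what is wanted is every pair of messages whose times lie within one minute of each other, one possible sketch works on seconds past the hour instead of on the minute field. This is a different technique from the ismember approach above, not a correction of it. The names DN1, DN2 and window and the sample values are made up for illustration. It assumes R2016b or later for implicit expansion and does not wrap around the hour, so 0059 is compared against 0000 to 0159, as in the question.

```matlab
% Hypothetical MMSS values for the two sets of messages
DN1 = [59; 5953];              % 00:59 and 59:53
DN2 = [0; 159; 200; 5900];     % 00:00, 01:59, 02:00 and 59:00

% Convert MMSS to seconds past the hour
t1 = 60*floor(DN1/100) + mod(DN1,100);
t2 = 60*floor(DN2/100) + mod(DN2,100);

window = 60;                   % assumed half-width of the range, in seconds

% inWindow(i,j) is true when entry i of set 1 and entry j of set 2
% are at most 'window' seconds apart
inWindow = abs(t1 - t2.') <= window;

% Row and column indices of every pair that still has to be compared
[i1, i2] = find(inWindow);
```

Here any entry of the first set can pair with several entries of the second set, so the actual message comparison would then be run only on the pairs listed in i1 and i2.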
vicente Noguer on 23 Sep 2021
And the other one is much clearer to me.

chrisw23 on 22 Sep 2021
strEx = "!AIVDM,1,1,,A,137JlD52h0P9tdRGbCQSm0kV0<1p,0*4C0053 !AIVDM,1,1,,B,13EsReP00009vQ`Gbj65gPiV00Sd,0*7B0053";
% check/modify the expression under https://regex101.com/
tbl = struct2table(regexp(strEx,exp,'names'))
This is just an example of how to parse text with a simple grouped regular expression. I use the website mentioned above to write and test expressions. The table gives easy access for further processing (e.g. datetime conversion), as previously shown. Look at string-based comparison methods like 'contains' or 'matches', e.g. tbl.strLoad.contains("137JlD52h0P9td"), which returns a logical index for accessing the matches.
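The expression itself is not shown in the post, so the following is only a guess at what a pattern with named tokens might look like. The variable name pat and the token names chan, chk and mmss are hypothetical; only strLoad appears in the answer. The pattern is stored in pat rather than exp so that it does not shadow the built-in exp function.

```matlab
strEx = "!AIVDM,1,1,,A,137JlD52h0P9tdRGbCQSm0kV0<1p,0*4C0053 !AIVDM,1,1,,B,13EsReP00009vQ`Gbj65gPiV00Sd,0*7B0053";

% Hypothetical pattern: channel, payload, checksum and trailing MMSS
% are captured as named tokens
pat = "!AIVDM,\d,\d,\d?,(?<chan>[AB]),(?<strLoad>[^,]+),\d\*(?<chk>[0-9A-F]{2})(?<mmss>\d{4})";

% One struct element per matched message, one field per named token
tok = regexp(strEx, pat, 'names');

% One table row per message; (:) forces a column struct array
tbl = struct2table(tok(:));

% Logical index of the rows whose payload contains the given text
hit = contains(tbl.strLoad, "137JlD52h0P9td");
```

With the two messages in strEx, tbl should have one row per message, and tbl(hit,:) selects the rows whose payload contains the search text.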
Hope it helps
Christian
chrisw23 on 23 Sep 2021
You're right. "This is just an example..." and not a 'best code' competition.