Parsing Strings with Values Missing
Hi everyone!
I am currently working on code that extracts the elevation of multiple GPS satellites from a string of data. Each line of data contains information about at most four satellites before continuing on a new line, so the last line often holds less data than the earlier ones. I tried to work around this with an if-else statement. Sadly, this doesn't work: when MATLAB parses the data, it does not treat two consecutive commas as a missing value and doesn't count the empty field, so the wrong values end up in my matrix. How can I overcome this? I have copied a few lines of my data below, along with my code. The full code is over 800 lines, so this is just a small excerpt.
A quick explanation of the data: I am looking to extract the 2-digit number just before the 3-digit number. That is the elevation of each satellite in the sky, in degrees. I need both GPGSV and GLGSV. The first number is the number of lines for the particular reading. The second number is the actual line number, so the first line is line 1 of 3 and so on. The third number is the number of satellites. The fourth number is irrelevant to my data collection.
Thank you very much in advance!
----------------------------------DATA------------------------------------
$GPGSV,3,1,12,01,09,252,27,03,46,296,47,04,02,227,20,14,27,103,46*7C
$GPGSV,3,2,12,16,25,184,26,22,02,159,32,23,19,300,48,25,19,041,40*74
$GPGSV,3,3,12,26,52,161,50,29,09,079,43,31,65,038,50,48,23,236,36*71
$GLGSV,3,1,09,67,08,149,,67,24,150,30,68,80,173,43,78,72,003,40*62
$GLGSV,3,2,09,70,10,333,,86,03,009,28,77,20,039,34,69,42,324,38*6E
$GLGSV,3,3,09,87,02,059,,,,,,,,,,,,,*5D
----------------------------------DATA------------------------------------
----------------------------------CODE------------------------------------
%GSV data
GSVcheck = strfind(AllData{1}, 'GSV');
GSVrows = find(~cellfun('isempty',GSVcheck));
GSVdata = AllData{1}(GSVrows);
GSVlength = floor(length(GSVdata)/6);

%'Empty' matrices
GSV = cell(DistanceLength*6,1);

%Parse $GSV
parseGSVdata = strsplit(GSVdata{counter},',');
numLines = parseGSVdata{2};
lineNum = parseGSVdata{3};
if lineNum ~= numLines
    GSV{counter,1} = parseGSVdata{6};
    GSV{counter,2} = parseGSVdata{10};
    GSV{counter,3} = parseGSVdata{14};
    GSV{counter,4} = parseGSVdata{18};
elseif lineNum == numLines
    dataLeft = parseGSVdata{4};
    dataAmount = numLines*4 - dataLeft;
    if dataAmount == 1
        GSV{counter,1} = parseGSVdata{6};
    elseif dataAmount == 2
        GSV{counter,1} = parseGSVdata{6};
        GSV{counter,2} = parseGSVdata{10};
    elseif dataAmount == 3
        GSV{counter,1} = parseGSVdata{6};
        GSV{counter,2} = parseGSVdata{10};
        GSV{counter,3} = parseGSVdata{14};
    elseif dataAmount == 4
        GSV{counter,1} = parseGSVdata{6};
        GSV{counter,2} = parseGSVdata{10};
        GSV{counter,3} = parseGSVdata{14};
        GSV{counter,4} = parseGSVdata{18};
    end
end
----------------------------------CODE------------------------------------
Accepted Answer
dpb
on 3 Jun 2016
Edited: dpb
on 4 Jun 2016
Actually, since the values are regularly spaced, simply create a format string that picks out the ones you want. I took the first and the last record and put them into the strings gpg and glg, respectively:
>> fmt=['%*s' repmat('%*f',1,4) repmat(['%f' repmat('%*f',1,3)],1,4) '%*s'];
>> gpval=cell2mat(textscan(gpg,fmt,'delimiter',','))
gpval =
     9    46     2    27
>> glval=cell2mat(textscan(glg,fmt,'delimiter',','))
glval =
     2   NaN   NaN   NaN
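If you would rather stay with the strsplit route from the original code, the likely culprit is that strsplit merges consecutive delimiters by default (its 'CollapseDelimiters' option is true), so the empty fields vanish and every later position shifts. Below is a minimal sketch, not a drop-in replacement, assuming the elevation positions 6, 10, 14 and 18 used in the question and two of its sample records; the names records, elevFields and elev are made up for the illustration.

```matlab
% Two records copied from the question's sample data.
gpg = '$GPGSV,3,1,12,01,09,252,27,03,46,296,47,04,02,227,20,14,27,103,46*7C';
glg = '$GLGSV,3,3,09,87,02,059,,,,,,,,,,,,,*5D';
records = {gpg, glg};

% Positions of the four elevation fields, as used in the question's code.
elevFields = [6 10 14 18];

% Default behaviour: runs of commas collapse into a single delimiter,
% so the short record loses its empty fields.
collapsed = strsplit(glg, ',');

elev = nan(numel(records), numel(elevFields));
for k = 1:numel(records)
    % Drop the '*hh' checksum, then split while keeping empty fields.
    body  = strtok(records{k}, '*');
    parts = strsplit(body, ',', 'CollapseDelimiters', false);

    % Guard against records that are shorter than expected.
    ok = elevFields <= numel(parts);
    % str2double turns empty strings into NaN, which marks the missing slots.
    elev(k, ok) = str2double(parts(elevFields(ok)));
end

disp(numel(collapsed))   % number of fields left after collapsing
disp(elev)               % one row per record, NaN where no satellite is listed
```

With the empty fields preserved, a short last line simply carries NaN in its unused columns, so the if/elseif ladder over the number of remaining satellites should no longer be needed.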
ADDENDUM
The missing-value conundrum comes from using a '%d' numeric format instead of '%f': the default missing value, NaN, can't be stored in an integer, and an integer is the class textscan returns for '%d'. I was unaware of that until some further checking on what was happening; I had always presumed everything numeric would be double(*) by default unless specifically cast to something else.
(*) Although it is, indeed, documented that textscan returns the output class of int or uint, for us old-timers used to "everything in Matlab is double unless", these new-fangled ways take some getting used to. [f|s]scanf, for instance, do not do this but return double, and neither does textread, the "since forever" precursor to textscan:
>> type int.dat
23,133
>> textread('int.dat','%d','delimiter',',')
ans =
    23
   133
>> whos ans
  Name      Size            Bytes  Class     Attributes
  ans       2x1                16  double
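For comparison, the same kind of check can be run against textscan itself. This is only a small sketch on inline strings rather than on int.dat, and the variable names cInt, cDbl and cGap are made up for the illustration:

```matlab
% '%d' makes textscan return an integer class, '%f' returns double.
cInt = textscan('23,133', '%d', 'Delimiter', ',');
cDbl = textscan('23,133', '%f', 'Delimiter', ',');
disp(class(cInt{1}))
disp(class(cDbl{1}))

% With '%f', an empty field between two commas can be represented as NaN.
cGap = textscan('23,,133', '%f', 'Delimiter', ',');
disp(cGap{1}.')
```

Reading the GSV fields with '%f' therefore keeps the missing elevations visible as NaN instead of forcing them into an integer.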
4 Comments
dpb
on 6 Jun 2016
Well, the repmat is solely Matlab; there's always the recourse of writing N (in this case, 20) individual format strings, but I find it easier to keep track of "who's who in the zoo" if I use the symmetry that is in the input record (presuming there is some, of course, which there usually is). C writes the format string as [Width[.Precision]]DataType instead of the Fortran FORMAT DataType[Width[.Precision]]. Since a numeric value already sits in front of the type specifier, parsing a repeat multiplier would be very tough, so it isn't implemented; hence you have to write every element explicitly in one form or another. What an unnecessary pain it is, indeed... :(
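To make the trade-off concrete, here is a small sketch that builds the format from the answer with repmat and compares it with the same string typed out specifier by specifier. The names fmtBuilt and fmtLong are made up for the illustration.

```matlab
% Format from the answer, built from the repeating pattern of the record.
fmtBuilt = ['%*s' repmat('%*f',1,4) repmat(['%f' repmat('%*f',1,3)],1,4) '%*s'];

% The same format written out one specifier at a time.
fmtLong = ['%*s' ...
           '%*f%*f%*f%*f' ...
           '%f%*f%*f%*f' ...
           '%f%*f%*f%*f' ...
           '%f%*f%*f%*f' ...
           '%f%*f%*f%*f' ...
           '%*s'];

disp(strcmp(fmtBuilt, fmtLong))       % 1 when the two strings are identical
disp(numel(strfind(fmtBuilt, '%')))   % number of conversion specifiers
```

The repmat version shows the structure of the record (header, then four groups of four fields) at a glance, which is much harder to see in the long string.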
Anyway, that annoyance aside, glad you got it going; hope something was learned as well as solving the immediate problem.