How to replace all 1x1 cells containing 'NAN' with 'NaN'?

Minh Tran on 7 Aug 2019
Commented: dpb on 12 Aug 2019
Attached is a cell array that I need to convert to an array of doubles (for plotting). The values were pulled from a text file (into a table) in which Not-a-Number values are represented by the string "NAN". This isn't ideal, since MATLAB's str2num function only recognizes 'NaN' and 'nan' and will only convert those two specific strings to the NaN value.
I've tried a couple of things and was surprised that these didn't work:
find(strcmp(bin1, 'NAN')); % 0×1 empty double column vector
find(strcmp(bin1, '"NAN"')); % 0×1 empty double column vector
So the tasks at hand are:
  1. Find all cells containing the string 'NAN' and replace it with 'NaN' or 'nan'.
  2. Convert the cell array of 'NaN' strings and doubles into an array of doubles.
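A minimal sketch of the two tasks, assuming the cells hold plain character vectors (the variable c and its values are made up for illustration; the attached bin1 may hold other types). Step 1 is largely optional, since str2double returns NaN for any text it cannot parse:

```matlab
% Hypothetical cell array of char vectors, as it might come from a text file
c = {'NAN'; '27'; 'NAN'; '36.6'};

% Task 1: case-insensitive match on 'nan', then substitute 'NaN'
isNanText = strcmpi(c, 'nan');
c(isNanText) = {'NaN'};

% Task 2: convert the whole cell array of text to a double column vector
d = str2double(c);
```

strcmpi is used instead of strcmp so that 'NAN', 'NaN' and 'nan' are all caught in one pass.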
________________
Unrelated, but this was the call I used to convert the table of strings to a cell array:
bin1 = cellfun(@str2num, table2cell(table.WH.bin1.speed), 'UniformOutput', false);
And this was how I got a table of string values from the text file:
table = readtable(filename, delimitedTextImportOptions('Delimiter', ','));

Accepted Answer

dpb on 7 Aug 2019
Edited: dpb on 9 Aug 2019
opt=detectImportOptions('Equinor.txt');
t=readtable('Equinor.txt',opt);
>> t.TIMESTAMP=datetime(t.TIMESTAMP,'InputFormat','uuuu-MM-dd HH:mm:ss')
t =
13×22 table
TIMESTAMP RECORD BinNum Depth EastComp NorthComp Speed Direction VerticalVel ErrorVel Corr_1 Corr_2 Corr_3 Corr_4 Echo_1 Echo_2 Echo_3 Echo_4 Pgp_1 Pgp_2 Pgp_3 Pgp_4
____________________ ______ ______ _____ ________ _________ _____ _________ ___________ ________ ______ ______ ______ ______ ______ ______ ______ ______ _____ _____ _____ _____
08-Jun-2018 14:09:58 806 1 -8.8 NaN NaN NaN NaN NaN NaN 6 6 6 5 33 34 39 40 0 0 100 0
08-Jun-2018 14:19:58 807 1 -8.8 NaN NaN NaN NaN NaN NaN 6 6 6 6 33 34 39 41 0 0 100 0
08-Jun-2018 14:29:58 808 1 -8.8 NaN NaN NaN NaN NaN NaN 5 6 6 6 33 34 39 40 0 0 100 0
08-Jun-2018 14:39:58 809 1 -8.8 NaN NaN NaN NaN NaN NaN 6 6 6 6 33 34 39 41 0 0 100 0
03-Oct-2018 09:29:54 1498 1 -9.1 NaN NaN NaN NaN NaN NaN 6 5 6 6 38 38 42 40 0 0 100 0
03-Oct-2018 10:29:54 1504 1 -9.1 NaN NaN NaN NaN NaN NaN 6 6 6 5 38 38 42 40 0 0 100 0
03-Oct-2018 11:29:54 1510 1 15.4 -25.7 8.4 27 288.1 -3.5 2.9 125 125 120 125 107 106 118 108 0 0 3 96
03-Oct-2018 12:09:54 1514 1 15.4 -33.1 15.7 36.6 295.4 1.5 5.1 126 126 123 126 106 105 118 106 0 0 0 99
03-Oct-2018 12:19:54 1515 1 15.4 -34.1 15.9 37.6 295 0.7 6.3 126 127 122 126 104 104 115 106 0 0 0 98
03-Oct-2018 12:29:54 1516 1 15.4 -36.6 10.8 38.2 286.4 1.2 7.1 127 126 121 127 104 103 117 105 0 0 0 98
03-Oct-2018 12:39:54 1517 1 15.4 -38.3 6.3 38.8 279.3 -1.3 6.6 127 126 122 127 104 103 117 104 2 0 0 98
03-Oct-2018 12:49:54 1518 1 15.4 -37.6 3.5 37.8 275.3 0.7 5.9 126 127 123 127 103 102 116 105 2 0 0 97
03-Oct-2018 12:59:54 1519 1 15.4 -41.3 2.8 41.4 273.9 -1.3 5.6 126 126 121 125 104 102 118 104 4 0 3 92
>>
NB: I trimmed a big chunk of the "NAN" section out of the posted file...
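A sketch of the same idea on a self-contained sample, assuming a small comma-delimited file with unquoted NAN entries (the file contents and the two columns are invented here, not taken from Equinor.txt; quoted "NAN" fields may be treated differently). Setting the variable type explicitly, and naming 'NAN' as a missing-value marker, avoids relying on whatever detectImportOptions guesses for the column:

```matlab
% Write a small hypothetical sample file (stand-in for the real data)
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, '%s\n', 'RECORD,Speed', '806,NAN', '807,NAN', '1510,27', '1514,36.6');
fclose(fid);

% Detect the layout, then force the Speed column to be read as double
opts = detectImportOptions(fname);
opts = setvartype(opts, 'Speed', 'double');
opts = setvaropts(opts, 'Speed', 'TreatAsMissing', 'NAN');

t = readtable(fname, opts);   % NAN entries arrive as NaN
delete(fname);                % clean up the temporary file
```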
  12 Comments
Minh Tran on 12 Aug 2019
"The conclusion is for such a file you'll have to ensure the correct format for each column if you can't guarantee the quoted string...and, since the testing here is pretty minimal so is possible other conditions could cause the symptom, it's basically required to make sure the VariableTypes option is set correctly when there's any doubt about how a particular column could be interpreted to ensure one reads it as expected."
dpb, thank you for digging into this. I'll play around with those test cases you mentioned and the VariableTypes property to get a better feel for the parser.
"Since the ML function bin2dec does expect string input, then forcing that column to be 'char' would be the correct choice to override the default interpretation as decimal input."
"You may need to develop a library of these import objects to fit the set of files you routinely deal with, but having done so your subsequent coding should become much cleaner."
That is a fine strategy.
dpb on 12 Aug 2019
Remember the options object is just a structure so you can simply SAVE the variable once created and debugged. You can either name variables to be identifiable and load specific ones based on the input file by name (or via an array or struct via field names) or use just one variable name and save multiple files to load...whatever seems to best fit with the rest of your application.
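For instance, an options object could be built once, saved to a MAT-file, and reloaded in a later session. This is only a sketch under assumptions: the file names, the two-column layout, and the variable name opts are placeholders, and the sample data are invented.

```matlab
% Hypothetical sample file standing in for a real data file
dataFile = [tempname '.txt'];
fid = fopen(dataFile, 'w');
fprintf(fid, '%s\n', 'RECORD,Speed', '806,NAN', '1510,27');
fclose(fid);

% Build and debug the options once ...
opts = delimitedTextImportOptions('NumVariables', 2, ...
    'VariableNames', {'RECORD', 'Speed'}, ...
    'VariableTypes', {'double', 'double'}, ...
    'Delimiter', ',', 'DataLines', [2 Inf]);

% ... then save them for reuse
optsFile = [tempname '.mat'];
save(optsFile, 'opts');

% Later: load the stored options and read the file with them
S = load(optsFile);
t = readtable(dataFile, S.opts);

delete(dataFile);
delete(optsFile);
```

Loading into a struct (S = load(...)) rather than straight into the workspace makes it easy to keep several option sets side by side under different field names.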

More Answers (1)

dpb on 7 Aug 2019
>> load bin1
>> whos bin1
Name Size Bytes Class Attributes
bin1 8396x1 1122684 cell
>> bin1{1}
ans =
"NAN"
>> whos ans
Name Size Bytes Class Attributes
ans 1x1 134 string
>> dat=str2double(bin1);
>> whos dat
Name Size Bytes Class Attributes
dat 8396x1 67168 double
>> dat(1)
ans =
NaN
>>
"More better" would be to attach a short portion of the text file and let's read it correctly first instead of having to fix up a mess later...
  1 Comment
Minh Tran on 7 Aug 2019
Edited: Minh Tran on 7 Aug 2019
I included an additional attachment: Equinor_EngData_WHBB_Bin1_partial.txt
str2double(bin1) converted all values to NaN.
The data become non-NaN on row bin1{855}.
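One likely explanation, given that bin1{1} displays as a string scalar: str2double returns NaN for any cell element that is not text, so cells that already hold doubles come back as NaN too. It may also be why the strcmp calls found nothing. A sketch that handles numeric and text cells separately, assuming bin1 mixes string scalars and scalar doubles (the sample values below are a made-up stand-in for the attachment):

```matlab
% Hypothetical stand-in for bin1: string scalars mixed with doubles
bin1 = {"NAN"; "NAN"; 27; 36.6};

% Flag the cells that already contain a scalar number
isNum = cellfun(@(c) isnumeric(c) && isscalar(c), bin1);

dat = nan(size(bin1));                              % preallocate with NaN
dat(isNum)  = [bin1{isNum}];                        % copy numeric cells as-is
dat(~isNum) = cellfun(@str2double, bin1(~isNum));   % parse text cells; "NAN" -> NaN
```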

Release

R2019a
