Matlab readmatrix inconsistently reading csv files

Question

Christian Taylor on 24 Aug 2023

https://fr.mathworks.com/matlabcentral/answers/2012302-matlab-readmatrix-inconsistently-reading-csv-files

Edited: Stephen23 on 24 Aug 2023

I'm using MATLAB's readmatrix function to read data from a CSV file and store it in a variable. The CSV files are identical in format, with a number of lines of text at the start before the data starts at line 21. However, readmatrix seems to behave inconsistently: sometimes it captures all the text at the start of the CSV and stores it as NaN, and other times it ignores these first 21 lines and only grabs the data. Why is this? What is a better way to do this?

7 Comments

Christian Taylor on 24 Aug 2023

Update: I have just opened my csv files in a text editor. Whilst the headers look identical in Excel, in the text editor there are a number of comma delimiters after most lines on one of the files. Perhaps this explains the different behaviour.

Stephen23 on 24 Aug 2023

Edited: Stephen23 on 24 Aug 2023

"I have just opened my csv files in a text editor. Whilst the headers look identical in Excel, in the text editor there are a number of comma delimiters after most lines on one of the files. Perhaps this explains the different behaviour."

Yes, differences between the files are most likely the cause.

Of course the algorithm used by READTABLE et al. is not perfect (there is no such thing) and it cannot read minds: what is obvious to a human is not obvious to a machine. It is always possible to trick or confuse an algorithm with the right combination of data or whatever; such things are mathematically unavoidable.

Note that relying on what files "look like" in MS Excel is a number one mistake that you should avoid: MS Excel mangles data in all sorts of horrible ways that look indistinguishable from inside Excel, e.g. adding or changing delimiters. It can also change data without any warning:

https://www.theverge.com/2020/8/6/21355674/human-genes-rename-microsoft-excel-misreading-dates

If you want reliable data processing do NOT open and save text files using MS Excel. It is a great tool for Excel spreadsheets... but for anything else... beware of dragons!
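
One way to compare files without going through Excel is to count the delimiters on each raw line in MATLAB. The following is only a sketch: the two files are small hypothetical stand-ins written to tempdir (they are not the files from the question), and one of them has trailing commas on its first header line.

```matlab
% Write two tiny stand-in files (hypothetical names and contents).
fileA = fullfile(tempdir, "demo_a.csv");
fid = fopen(fileA, "w");
fprintf(fid, '%s\n', 'Instrument: demo', 'x,y,z', '1,2,3');
fclose(fid);

fileB = fullfile(tempdir, "demo_b.csv");
fid = fopen(fileB, "w");
fprintf(fid, '%s\n', 'Instrument: demo,,', 'x,y,z', '1,2,3');
fclose(fid);

% Read each file as raw text, split it into lines, and count the commas per line.
commasA = count(splitlines(strtrim(string(fileread(fileA)))), ",");
commasB = count(splitlines(strtrim(string(fileread(fileB)))), ",");

% Show the counts side by side and list the line numbers that differ.
disp([commasA commasB])
disp(find(commasA ~= commasB))
```

Lines where the counts differ are the ones worth looking at in a text editor.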


Answer 1

Steven Lord on 24 Aug 2023

https://fr.mathworks.com/matlabcentral/answers/2012302-matlab-readmatrix-inconsistently-reading-csv-files#answer_1293137

If you know exactly how many header lines your file contains, I would specify the NumHeaderLines name-value argument in your readmatrix call.
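
A minimal sketch of that first suggestion, using a small hypothetical stand-in file with three header lines (the file name and contents are assumptions for illustration only):

```matlab
% Write a tiny stand-in file: 3 header lines, then numeric data.
fname = fullfile(tempdir, "demo_header.csv");   % hypothetical file name
fid = fopen(fname, "w");
fprintf(fid, '%s\n', 'Instrument: demo,,', 'Operator: someone,,', 'x,y,z', '1,2,3', '4,5,6');
fclose(fid);

% Tell readmatrix explicitly how many lines to skip instead of letting it guess.
data = readmatrix(fname, "NumHeaderLines", 3);
disp(data)
```

For files like those in the question, where the data is said to start at line 21, the value would presumably be 20, but check that against the actual files.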

Alternately you can create a file import options object using detectImportOptions. Once it's been created check that its properties that specify where the data is located (either DataRange or DataLines) and where any variable metadata is located (VariableNamesLine, VariableDescriptionsLine, VariableUnitsLine, or the corresponding Range properties for SpreadsheetImportOptions) match your expectations for where the data / metadata is located based on the expected format of the files. Once you've confirmed that they match your expectations, pass that import options object into readmatrix as the opts input argument.
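
The import-options route could look roughly like this. Again this is a sketch on a hypothetical stand-in file; the line numbers assigned to DataLines and VariableNamesLine are assumptions that fit this demo file, not values from the question.

```matlab
% Write a tiny stand-in file: 3 header lines, then numeric data.
fname = fullfile(tempdir, "demo_opts.csv");     % hypothetical file name
fid = fopen(fname, "w");
fprintf(fid, '%s\n', 'Instrument: demo,,', 'Operator: someone,,', 'x,y,z', '1,2,3', '4,5,6');
fclose(fid);

% Let MATLAB detect the layout, then inspect what it decided.
opts = detectImportOptions(fname);
disp(opts.DataLines)            % where the detected data block starts and ends
disp(opts.VariableNamesLine)    % which line was taken as the variable names

% Override the detection where it does not match the known file format.
opts.VariableNamesLine = 3;
opts.DataLines = [4 Inf];
opts = setvartype(opts, "double");   % import every column as double

data = readmatrix(fname, opts);
disp(data)
```

The same opts object can be reused for every file that shares the format, so all files are read with identical settings regardless of what the detection would have guessed for each one.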

If the import options properties don't match what you expect, and reviewing the file doesn't indicate to you why MATLAB is detecting the values for those properties that it is, please send a sample data file that demonstrates this behavior to Technical Support using this link along with the import options object and describe the results you expect. It's possible that you've identified a bug or an ambiguous edge case in the import options detection algorithm.
