Split CSV on commas but prevent doing so for a string with a comma in it

Tycho Maas on 13 Dec 2020
Commented: Cris LaPierre on 14 Dec 2020
My Excel csv file looks like this:
Data,test,04-12-2020 13:11,0,"8,2",1,2,3
Currently I use the following code to separate the columns:
[~,~,dataCGM] = xlsread('file.csv');
outCGM = regexp(dataCGM, ',', 'split');
outCGM = outCGM(2:end-1);
This does split the columns on commas but also does so for the string "8,2" which is not what I want. Does anyone know how to prevent this issue and keep the value as a string in a single column?

Answers (2)

Cris LaPierre on 13 Dec 2020
Perhaps one of the options given here is helpful.
  20 Comments
Stephen23 on 14 Dec 2020
Edited: Stephen23 on 14 Dec 2020
"Maybe someone like @Stephen Cobeldick, who is a regexp ninja, can improve on this."
Thank you for the unique commendation.
Although it is probably not the fastest approach, I would try importing the entire file as one string, applying some string manipulation to it to remove the line-end quotation marks (e.g. with REGEXPREP), and then writing a new file that can be imported directly using READTABLE. That has the benefit of importing all the different data classes correctly without much overhead, with all of the standard READTABLE options available.
It is not trivial because of course valid quotes around a string should not be removed.
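A minimal sketch of that clean-up idea, under a loud assumption: the real file is not shown in this thread, so the sample below assumes every line is wrapped in one extra pair of double quotes and that quotes inside the line are doubled. The sample text and the temporary file name are hypothetical. If some lines are not wrapped, the pattern would strip a valid closing quote, which is exactly the non-trivial case mentioned above.

```matlab
% ASSUMPTION (not confirmed by the thread): each line is wrapped in one
% extra pair of double quotes and the inner quotes are doubled.
raw = sprintf('"Data,test,04-12-2020 13:11,0,""8,2"",1,2,3"\r\n');

% Remove the quote that opens each line and the quote that closes it
% (the lookahead tolerates an optional carriage return before the line end).
txt = regexprep(raw, '^"|"(?=\r?$)', '', 'lineanchors');

% Undo the doubling so that "8,2" becomes an ordinary quoted field.
txt = strrep(txt, '""', '"');

% Write the cleaned text to a new (temporary) file and import that file.
cleanFile = [tempname '.csv'];
fid = fopen(cleanFile, 'w');
fprintf(fid, '%s', txt);
fclose(fid);
T = readtable(cleanFile, 'Delimiter', ',', 'ReadVariableNames', false);
delete(cleanFile);
```

The string handling is kept separate from the import so the cleaned text can be inspected before READTABLE sees it.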
This issue pops up enough to indicate that it would be nice for it to be handled natively:
Perhaps it would be a useful addition for READTABLE et al to include an option named e.g. LINEQUOTE which can be set to the required character (by default empty).
Cris LaPierre on 14 Dec 2020
I can only make it work for what I see.
You can look into what settings are available from detectImportOptions. I suspect the NumHeaderLines is what you are looking for.
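As a rough illustration of that suggestion, the sketch below detects the import options for a small stand-in file and passes them to readtable. The header text, the repeated data row, and the value 1 for NumHeaderLines are assumptions made for this example; the original file may need different settings.

```matlab
% Hypothetical stand-in file: one header line, then the data row from the
% question (repeated so that there is more than one row to detect from).
sampleFile = [tempname '.csv'];
row = 'Data,test,04-12-2020 13:11,0,"8,2",1,2,3';
fid = fopen(sampleFile, 'w');
fprintf(fid, '%s\n', 'Exported measurements', row, row);
fclose(fid);

% Detect the import settings, skipping the assumed single header line.
opts = detectImportOptions(sampleFile, 'NumHeaderLines', 1, 'Delimiter', ',');
disp(opts)   % inspect the detected delimiter, variable types, and so on

% Import using the detected (and, if needed, adjusted) options.
T = readtable(sampleFile, opts);
delete(sampleFile);
```

Displaying opts before importing shows which properties can be adjusted when the detection guesses wrong.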



Walter Roberson on 13 Dec 2020
readtable() with a format that is
'%s,%s,%{dd-MM-uuuu HH:mm}D,%f,%q,%f,%f,%f'
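A hedged sketch of how such a call might look. It is a variant rather than the exact format above: the literal commas are dropped from the format and the comma is passed as 'Delimiter' instead. The %q specifier keeps the double-quoted "8,2" together as one text field. The temporary sample file is hypothetical, and treating "8,2" as a number with a decimal comma is an assumption.

```matlab
% Hypothetical stand-in file containing the data row from the question.
sampleFile = [tempname '.csv'];
fid = fopen(sampleFile, 'w');
fprintf(fid, '%s\n', 'Data,test,04-12-2020 13:11,0,"8,2",1,2,3');
fclose(fid);

% One conversion per column; %q reads a double-quoted field as text.
fmt = '%s%s%{dd-MM-uuuu HH:mm}D%f%q%f%f%f';
T = readtable(sampleFile, 'Format', fmt, 'Delimiter', ',', ...
    'ReadVariableNames', false);
delete(sampleFile);

% Optional, ASSUMING "8,2" is a number written with a decimal comma:
value = str2double(strrep(T.Var5, ',', '.'));
```

This still requires knowing the column layout in advance, which is the limitation raised in the comments on this answer.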
  2 Comments
Tycho Maas on 13 Dec 2020
Thanks but the code needs to work on itself without predefining what will be in which column.
Image Analyst on 13 Dec 2020
That makes no sense. A program will not "work on itself". You need to tell your code HOW to process the file. It won't magically figure it out. Attach your csv file if you need more help.


Release

R2018a
