unexpected writetable and readtable behavior

8 vues (au cours des 30 derniers jours)
Kathryn Lund
Kathryn Lund le 11 Mai 2023
I have a table like
T = cell2table({[1 2 3], 2});
I then make some changes and save it to a CSV file:
writetable(T, 'test')
But later I need to read it again:
T2 = readtable('test')
T2 = 1×4 table
Var1_1 Var1_2 Var1_3 Var2 ______ ______ ______ ____ 1 2 3 2
And I discover that faux column headers have been introduced:
disp(T2)
Var1_1 Var1_2 Var1_3 Var2 ______ ______ ______ ____ 1 2 3 2
This seems to be something writetable is hard-wired to do.
Alternatively, I could load something that looks like T directly from a CSV, and use a ";" delimiter to be safe:
T3 = readtable('test2');
where test2 is
Var1;Var2
[1,2,3];2
From here I get close to what I want: the vector stored in a cell (albeit as a char, but str2num can help here).
T3 = 1×2 table
Var1 Var2
_______ ____
{'[1 2 3]'} 2
My question is: why doesn't writetable store T as in test2? Even when I force writetable to use a different delimiter, it still creates extra columns. It seems the only way around this is to convert all vectors to a string manually.

Réponses (2)

Steven Lord
Steven Lord le 11 Mai 2023
This is the documented behavior. If you scroll down to the Algorithms section on the writetable documentation page, one of the entries states:
"There are some instances where the writetable function creates a file that does not represent T exactly. You will notice this when you use readtable to read that file. The resulting table might not have the same format or contents as the original table. If you need to save a table and retrieve it at a later time to match the original table exactly, with the same data and organization, then save it as a MAT-file. writetable writes an inexact table in the following instances:"
One of the 'following instances' specified is "For variables that have more than one column, writetable appends a unique identifier to the variable name to use as the column headings."
I am not certain of the reason behind this difference in the table in MATLAB and the table written to the file by writetable, but I suspect that it might cause problems when writing to a spreadsheet file. I'm not sure how the variable Var1 in your original table would be represented if you were writing to a Microsoft Excel spreadsheet file, for example, and this may be a way to obtain more consistent behavior across the different file formats (text and spreadsheet.)

Jeremy Hughes
Jeremy Hughes le 11 Mai 2023
I think part of this is that the table looks like this:
T1 = cell2table({[1 2 3], 2})
T1 = 1×2 table
Var1 Var2 ___________ ____ 1 2 3 2
Where most tables look like:
T2 = table([1; 2; 3], [2;2;2])
T2 = 3×2 table
Var1 Var2 ____ ____ 1 2 2 2 3 2
In spreadsheet or CSV it's not possible to represent a multi-column variable, so the function adds headers (VariableNames) to the output.
The output of writing the first table as CSV is:
writetable(T1, "test1.csv")
type("test1.csv")
Var1_1,Var1_2,Var1_3,Var2 1,2,3,2
And that's what's read back in you get exactly what you see.
Whereas the second table would become:
writetable(T2, "test2.csv")
type("test2.csv")
Var1,Var2 1,2 2,2 3,2
You can ask writetable not to write the VariableNames, using
writetable(T1, "test1.csv", WriteVariableNames=false)
type("test1.csv")
1,2,3,2
or
writetable(T2, "test2.csv", WriteVariableNames=false)
type("test2.csv")
1,2 2,2 3,2
You can see the data is the same, it's just expanding the mult-column values into their own columns, and readtable doesn't have any way of putting them together again.

Catégories

En savoir plus sur Data Import and Analysis dans Help Center et File Exchange

Tags

Produits


Version

R2022b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by