restructure cell array from textscan

Peter Lobato on 3 Aug 2018
Edited: Jan on 5 Aug 2018
Hi folks, I'm using textscan to read data from a .csv file into a cell array. The result is a cell array with one row and multiple columns, where each column holds its own {M x 1} cell (i.e., each column in my .csv dataset gets packed into its own cell).
C =
Columns 1 through 3
{5x1 cell} {5x1 cell} {5x1 cell}
Is there a way to instead generate a cell array with a single column and multiple rows, each row containing one row of data from my .csv file (i.e., each row in my .csv dataset gets packed into its own cell)? I'm trying to get it to look like this (if it's even possible):
C =
{3x1 cell}
{3x1 cell}
{3x1 cell}
{3x1 cell}
{3x1 cell}
The reason is that I'm bringing in a large number of very large .csv files, concatenating them, and exporting the result as a tab-delimited .txt file using fprintf. Due to memory limitations, I can only import one file at a time to write to the output file.
Thanks!
  3 Comments
Image Analyst on 3 Aug 2018
Why not simply use csvread() to read the data into a double array? Why hassle with the complications of a cell array?
Peter Lobato on 4 Aug 2018
Normally csvread or dlmread would be much easier, but those (I think) can only import numeric data. The problem with my data is that any cell can contain either a number or text (e.g., if a thermocouple starts to fail, a column will intermittently switch between a number and "INF"). I also want to keep the intermittent "INF" so I can see if and when a thermocouple is failing. I figured the easiest way would be to import everything as a cell array, so it doesn't matter what is inside each cell.
I attached an example data file. All files have the exact same headers, same number of columns, but variable number of rows (usually around 100000).


Accepted Answer

Jan on 4 Aug 2018
Some testdata:
C = {};
for i1 = 1:3
    for i2 = 1:5
        C{i1}{i2} = i1*10+i2;
    end
end
Now the conversion:
CC = num2cell(cat(1, C{:}), 1)
Maybe you want to transpose CC. But is this really useful?
Reason being is I'm bringing in a large number of very large .csv files,
concatenating them and exporting as a tab-delimited .txt file using fprintf.
It seems to be much easier to import the files as text, use strrep to replace the commas with tabs, and append the result to the output file as a string. Converting and reshaping the cells is most likely a waste of time.
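Note that the test data above builds each inner cell as a 1x5 row, whereas the textscan output shown in the question holds 5x1 columns. For that shape, the same idea can be sketched with concatenation along dimension 2 instead. This is only an illustration: the loop below fabricates a hypothetical stand-in for the textscan output, and the names M and R are not from the original post.

```matlab
% Hypothetical stand-in for textscan output: a 1x3 cell, each entry a 5x1 cell
C = cell(1, 3);
for col = 1:3
    C{col} = cell(5, 1);
    for row = 1:5
        C{col}{row} = sprintf('%d', col*10 + row);   % text, as '%s' would give
    end
end

M = cat(2, C{:});      % 5x3 cell: one row per data row, one column per field
R = num2cell(M, 2);    % 5x1 cell: each entry is a 1x3 cell holding one data row
```

Each R{k} is a 1x3 row here; if 3x1 columns are wanted as in the question, the entries can be transposed, e.g. with cellfun(@(r) r.', R, 'UniformOutput', false).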
  3 Comments
Peter Lobato on 5 Aug 2018
I think I figured it out from the last thing you said: replacing commas with \t. One thing I missed earlier, which completely threw me off, is that my data file has whitespace before every non-numeric entry, such as "INF" or "NAN" (I didn't see it in Excel, but happened to see it in Notepad). Evidently, textscan starts a new cell by default when it encounters whitespace. To get around it, I set the 'Whitespace' parameter to a character I know isn't anywhere in the file (in this case '|'; a cheesy workaround, but it works). I also got rid of 'Delimiter', so each row of data is read as one string, commas included. The call looked like this:
C = textscan(fid,repmat('%s',1,311),'HeaderLines',6,'Whitespace','|');
Then to print,
fid2 = fopen('RPECS.txt','a+');
for c = 1:length(C{1})
    CC = [strrep(char(C{1}(c)),',','\t'),'\r\n'];
    fprintf(fid2,CC,'Delimiter','\t');
    clear CC
end
Thanks so much for the help!
Jan on 5 Aug 2018
Edited: Jan on 5 Aug 2018
fprintf does not have a "Delimiter" argument.
What about:
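Str = fileread(FileName);                % read the whole file as one char vector
Str = strrep(Str, ',', sprintf('\t'));   % replace commas with tabs
Str = strrep(Str, ' ', '');              % remove spaces?
fid = fopen(OutputFile, 'a');            % open for appending; fopen defaults to read-only
if fid == -1
    error('Cannot open file: %s', OutputFile);
end
fwrite(fid, Str, 'char');
fclose(fid);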
This replaces all commas with tabs. But is this useful at all? What do you actually want to achieve? Which transformation is wanted? I have the impression that converting the output of textscan is only a confusing indirection.
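Applied to the original goal of concatenating several files one at a time, the text-based approach might look roughly like the sketch below. It is a minimal illustration under stated assumptions: all file names are hypothetical placeholders, the two small demo inputs are written to tempdir only so the snippet runs on its own, and header lines are not skipped (the real files have 6 header lines each, which would need separate handling).

```matlab
% Hypothetical input and output files (placeholders, not from the original post)
inFiles = {fullfile(tempdir, 'demo_a.csv'), fullfile(tempdir, 'demo_b.csv')};
outFile = fullfile(tempdir, 'demo_out.txt');

% Write two small demo inputs so the sketch is self-contained
demoText = {sprintf('1,2, INF\n3,4,5\n'), sprintf('6, NAN,8\n')};
for k = 1:numel(inFiles)
    fid = fopen(inFiles{k}, 'w');
    fwrite(fid, demoText{k}, 'char');
    fclose(fid);
end

fidOut = fopen(outFile, 'w');                 % 'w' starts a fresh output file
assert(fidOut ~= -1, 'Cannot open output file.');
for k = 1:numel(inFiles)
    str = fileread(inFiles{k});               % only one file in memory at a time
    str = strrep(str, ',', sprintf('\t'));    % commas -> tabs
    str = strrep(str, ' ', '');               % drop the spaces before INF/NAN
    fwrite(fidOut, str, 'char');              % raw write, no format-string parsing
end
fclose(fidOut);

result = fileread(outFile);                   % read back to inspect the output
```

Using fwrite rather than passing the data as the fprintf format string means that any % or \ characters in the data are written literally.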


More Answers (1)

jonas on 4 Aug 2018
Edited: jonas on 4 Aug 2018
What about this solution? I've replaced one of the numbers in the first column with a string, just to make sure it works.
%% Read data
opts = detectImportOptions('example_data.csv','NumHeaderLines',6);
opts = setvartype(opts,'double');
T = readtable('example_data.csv',opts);
%% Keep the first 15 columns only (as a numeric matrix)
T = T{:,1:15};
%% Save to cell array if you really want (one cell per column)
B = mat2cell(T,size(T,1),ones(1,size(T,2)))'
B =
15×1 cell array
[15×1 double]
[15×1 double]
[15×1 double]
...
Text entries are imported as NaN, except for Inf, which is imported as Inf. If you want strings instead of doubles for some reason, change the second argument of setvartype to 'string'.
