read table handling column of number values as column of strings

29 views (last 30 days)
Hi.
I've a .csv file in the following form:
filepath ,speed ,numbers ,parameter1 ,parameter2
5991e93b-00b5-4f4d-80d0-e6616a990967.mat ,* ,460 ,testing ,fwd
ab8b2c10-8697-4616-b2e7-aea6f0a7cbbb.mat ,50 ,600 ,twocontrollers ,*
8fad610f-677d-4b6b-9e98-2576f65386a7.mat ,99 ,100 ,testing ,*
6f77eeec-f211-42b6-bd64-af05c8a7a06a.mat ,50 ,420 ,* ,aft
821fe78b-5341-4ab7-9d2a-384b201263da.mat ,80 ,430 ,twocontrollers ,aft
6b409b33-6475-4645-9d64-c014edfc871b.mat ,99 ,470 ,testing ,aft
0827444a-a3ed-4357-83e3-91c8f7b8ed6e.mat ,30 ,200 ,twocontrollers ,*
3b991d9e-0687-4136-a5e3-c004df126eaf.mat ,80 ,600 ,arc5 ,aft
bbe06f6a-00d6-43a2-9b6b-72d9d3e050c8.mat ,80 ,450 ,twocontrollers ,aft
9601cc8a-0fda-424d-808b-b80acdc1bd9a.mat ,99 ,430 ,working ,fwd
18db3b45-f1c3-472d-9ace-acc9bb42cfe8.mat ,* ,200 ,fivecontrollers ,*
a34e95e4-5160-4902-a5b0-cc99336fe0f7.mat ,90 ,430 ,working ,low
6ddc43bb-73ae-4d6b-b24e-dae60328acbe.mat ,99 ,100 ,[feed] ,fwd
f10a9460-1332-47a4-85dd-cc19865a0642.mat ,60 ,450 ,* ,fwd
2d657e59-4e58-4c85-be7c-3af7dcc0dc0f.mat ,80 ,400 ,twocontrollers ,aft
f426762d-8767-4e1e-89d6-b2aaa3e3a856.mat ,60 ,100 ,two_feedbacks ,*
26be1950-9f83-4578-9060-65fb29af0403.mat ,70 ,450 ,* ,low
25c28395-6ab3-48df-838f-343439ffb4a0.mat ,99 ,440 ,fivecontrollers ,low
If I read the csv with readtable it works correcly, apart one thing that I need. As you can see, there's a numbers column that's represented only by numbers. Instead, for the business logic of my application, it must be a column of strings.
In addition to this, CSV can have different fields, so I don't know the number and name of columns before loading the csv. So I can't use the Format option because I don't know the number of columns before loading the table. Since it's a batch that process a very large number of csv files I'd like to avoid to read the csv file twice, one for retrieving the number of columns in order to check how many %s should be added in the Format option and then reloading the csv gile again with the right Format string.
Is there a way to tell matlab to load a csv into a table with readtable by handling every column as a column of strings without knowing their name?

Accepted Answer

Johannes Hougaard
Johannes Hougaard on 16 Apr 2020
Hi Daniele
I have provided a function for you that seems to do the trick (attached and in code-block below).
Feel free to use it without any limitations - and please alter it as you wish (no rights reserved)
This may not be the most elegant solution but as it is based on textscan it should be fairly robust, customizable and reasonably fast.
Be aware that it is currently hard-coded that the delimiter is ',' as in your example (this can be changed in line 34 and 35 if desired - and maybe turned into an input if it changes often).
Error handling, testing and help is kept at a minimum.
function outputtable = csvfile2table(csvfile)
% CSVFILE2TABLE Reads a CSV file as text into a MATLAB table using the first line as variable names
% Similar to readtable but keeps all data as text rather than converting numerical data
%
% I/O
% outputtable = csvfile2table(csvfile);
% OR
% outputtable = csvfile2table;
%
% INPUT
% csvfile = the full filename and path of the file to be read.
%
% OUTPUT
% outputtable = a table array containing the data from the CSV file as texts
%
% 2020-04-16 XJHH@novonordisk.com
%
% see also readtable textscan
if nargin == 0 || isempty(csvfile)
[filename,filepath] = uigetfile({'*.csv' 'CSV file (*.csv)'},'Select .CSV file to read into table');
csvfile = fullfile(filepath,filename);
end
fid = fopen(csvfile);
if fid <= 0
err = MException([mfilename,':MissingFile'],'File %s is not existing or cannot be read',csvfile);
throw(err);
end
filecontent = textscan(fid,'%s','delimiter','\n');
fclose(fid);
tablecontent = cellfun(@(x)textscan(x,'%s','delimiter',','),filecontent{1}(2:end));
tableheaders = cellfun(@(x)textscan(x,'%s','delimiter',','),filecontent{1}(1));
outputtable = cell2table(strtrim(horzcat(tablecontent{:})'),'VariableNames',tableheaders{1});

More Answers (1)

Peng Li
Peng Li on 16 Apr 2020
you could call detectImpactOptions first, and modify format from there before readtable.
  2 Comments
Peng Li
Peng Li on 16 Apr 2020
how about using sort of lower level function first, e.g., fgetl(), and parse the string to see how many variables you have. After that you can construct a format string for readtable.

Sign in to comment.

Products


Release

R2014a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by