How to access the data type of each column of a table
I have a table which contains categorical and numerical data. In order to separate the two, I want to know how to access the data type of each column of the table.
Since I don't know how to do that, I tried the function iscellstr as follows:
for i = 1:size(T,1)
for j=1:size(T,2)
CurrentData {i,j} = T{i,j}; %// Access all rows of a given column.
if iscategorical(CurrentData)
CategoricalData{i,j} = CurrentData{i,j};
elseif iscellstr(CurrentData)
StringcData{i,j} = CurrentData{i,j};
else
NumericData{i,j} = CurrentData{i,j};
end
end
end
But it does not seem to be working either. Any help would be appreciated.
1 comment
Jan
20 Oct 2014
It is overwhelming to read such a large amount of information. Shorter questions are more likely to be answered. Concentrate on the essential core of the problem and omit all details that do not concern an answer.
Accepted Answer
Peter Perkins
23 Oct 2014
Edited: Peter Perkins
23 Oct 2014
Ege, your question isn't all that clear. But I'm guessing that you don't want to be looping over every element of the table and putting things in a cell array. What about this:
>> t = table(randn(3,1),randn(3,2),categorical({'a';'a';'b'}),categorical({'x';'y';'y'}))
t =
     Var1               Var2              Var3    Var4
    _______    ______________________    ____    ____
     2.7694       0.7254     -0.20497    a       x
    -1.3499    -0.063055     -0.12414    a       y
     3.0349      0.71474       1.4897    b       y
>> numericVars = varfun(@isnumeric,t,'output','uniform')
numericVars =
1 1 0 0
>> tNumeric = t(:,numericVars)
tNumeric =
     Var1               Var2
    _______    ______________________
     2.7694       0.7254     -0.20497
    -1.3499    -0.063055     -0.12414
     3.0349      0.71474       1.4897
>> tNonNumeric = t(:,~numericVars)
tNonNumeric =
    Var3    Var4
    ____    ____
    a       x
    a       y
    b       y
Or, to extract the numeric data as a plain matrix rather than a table:
>> tNumeric = t{:,numericVars}
tNumeric =
     2.7694       0.7254     -0.20497
    -1.3499    -0.063055     -0.12414
     3.0349      0.71474       1.4897
Hope this helps.
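If what you are after is the class name of each variable rather than a numeric/non-numeric mask, the same varfun idea can be sketched with @class. This is only a minimal sketch: the table and its variable names (Num, Cat, Str) are made up for illustration, and the full 'OutputFormat' name is used with the value 'cell'.

```matlab
% Hypothetical example table; the variable names are illustrative only.
t = table([1;2;3], categorical({'a';'a';'b'}), {'x';'y';'z'}, ...
    'VariableNames', {'Num','Cat','Str'});

% One class name per table variable, returned as a 1-by-N cell array.
varClasses = varfun(@class, t, 'OutputFormat', 'cell');

% Logical masks, one entry per variable, usable for indexing the table.
isNum = strcmp(varClasses, 'double');
isCat = strcmp(varClasses, 'categorical');
isStr = strcmp(varClasses, 'cell');

tCat = t(:, isCat);   % table holding only the categorical variables
tNum = t(:, isNum);   % table holding only the double variables
```

Comparing against 'double' misses other numeric classes such as single or int32, so the isnumeric mask shown earlier is probably the safer test when all numeric types should be grouped together.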
More Answers (1)
Geoff Hayes
20 Oct 2014
I think that your code is almost there in terms of breaking down the data into categorical, string, and numeric types. It is just the line of code that initializes CurrentData which may be causing a problem:
CurrentData {i,j} = T{i,j};
Since CurrentData is a cell array that grows on every iteration, we do not get the expected response when we call (for example) iscellstr(CurrentData), because it tests not just the current element but also all previous ones, some or all of which might be cell arrays of strings.
What we should probably do instead is just consider the one (current) element as
CurrentData = T{i,j};
Then we use that element when we need to update the categorical, string, or numeric arrays. Your code would then become
CategoricalData = cell(size(T));
StringcData = cell(size(T));
NumericData = cell(size(T));
for u = 1:size(T,1)
    for v = 1:size(T,2)
        CurrentData = T{u,v};
        if iscategorical(CurrentData)
            CategoricalData{u,v} = CurrentData;
        elseif iscellstr(CurrentData)
            StringcData{u,v} = CurrentData;
        else
            NumericData{u,v} = CurrentData;
        end
    end
end
The above code preallocates all three cell arrays to the size of the table. (This is a good habit to get into, as performance may degrade when arrays grow inside a loop.) I replaced your indices i and j with u and v, since i and j are also used to represent the imaginary unit. We then iterate through all elements of the table and group the categorical, numeric, and string data together. Note that if a column of one of your *Data arrays is empty, then the corresponding table column is not of that type (for example, not numeric). I tried the above with the example
load patients
BloodPressure = [Systolic Diastolic];
T = table(Gender,Age,Smoker,BloodPressure,'RowNames',LastName);
4 comments
Geoff Hayes
23 Oct 2014
If your numerical data consists of m-by-1 columns only (for some m), then you could try
for u = 1:size(T,2)
    CurrentData = T.(u)(1);
    if iscategorical(CurrentData)
        CategoricalData(:,u) = T.(u);
    elseif iscellstr(CurrentData)
        StringcData(:,u) = T.(u);
    else
        data = T{:,u};
        if iscolumn(data)
            NumericData(:,u) = num2cell(T{:,u});
        else
            % not sure how to handle non-column data
        end
    end
end
The difficulty with the above is handling the case where T{:,u} has more than one column, so that each row is non-scalar, something like
[ 1 3 ]
[ 2 4 ]
etc.
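One possible way around the non-column case is to keep one cell per table variable instead of one cell per element, so that a multi-column variable is stored whole. This is only a sketch under that assumption: the small table below is made up to stand in for the patients data, and the 1-by-N layout of the *Data arrays differs from the code above.

```matlab
% Hypothetical small table with a two-column numeric variable.
T = table({'F';'M'}, [38;43], [124 93; 109 77], ...
    'VariableNames', {'Gender','Age','BloodPressure'});

nVars = width(T);
CategoricalData = cell(1, nVars);   % one cell per variable, not per element
StringcData     = cell(1, nVars);
NumericData     = cell(1, nVars);

for v = 1:nVars
    col = T.(v);                    % whole variable, any number of columns
    if iscategorical(col)
        CategoricalData{v} = col;
    elseif iscellstr(col)
        StringcData{v} = col;
    else
        NumericData{v} = col;       % an m-by-2 matrix is stored intact
    end
end
```

With this layout, an empty cell in NumericData means the corresponding variable is not numeric, and NumericData{3} holds the full two-column BloodPressure matrix.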