Question on CATEGORICAL and Help files

Well, not a question as much as a comment. I'm finding the Help files a bit too brief for this class. Does anyone else agree?
For example:
  • How to preallocate a categorical array? I'm dealing with 13,000,000 records, and moving the data into variables on a 32-bit machine hits the memory limit quickly. Preallocation is critical in this application.
  • How to turn a categorical array back into numbers or characters? Hehe, trying to use the variable editor to paste the categorical data into a column of another (non-categorical) variable froze the machine (or was it a crash?).

Accepted Answer

Sean de Wolski on 20 Aug 2014
I usually preallocate tables or categoricals dynamically by running my loop backward:
T = table;
for ii = 10:-1:1
    % The first pass assigns row 10, so T grows to its full size once
    T(ii,:) = array2table(rand(1,4));
end
To your second question:
c = categorical({'red';'red';'blue'})
cellstr(c)
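The conversions back to characters or numbers can be sketched a little further. This is a minimal example, assuming the base-MATLAB categorical class (R2013b or later); the variable names are only illustrative.

```matlab
c = categorical({'red'; 'red'; 'blue'});

names = cellstr(c);      % cell array of character vectors, one per element
chars = char(c);         % padded character matrix, one row per element
codes = double(c);       % position of each element in categories(c)
cats  = categories(c);   % category names, sorted alphabetically by default

% A categorical built from numbers keeps the values as category names,
% so one way to get the numbers back is to go through text.
n    = categorical([10 20 10]);
vals = str2double(cellstr(n));
```

Note that double(c) returns category codes, not the original values, which is why the numeric case goes through cellstr and str2double here.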

9 Comments

Georges on 20 Aug 2014
Good to know that categorical can be called without arguments. Looping backwards is a nice, simple trick.
Simple and obvious, thanks.
dpb on 20 Aug 2014
How does running the loop backwards help with categoricals? I don't see that...
@dpb
C = categorical;
for ii = 10:-1:1
C(ii,:) = char(randi(10)+65);
end
If you don't have all of the values up front (say we're reading from 100 files), this allows you to preallocate elegantly. At least it's the most elegant thing I've found :)
>> c=categorical
Error using categorical
Abstract classes cannot be instantiated. Class 'categorical' defines abstract methods
and/or properties.
>>
This must be a newer feature.
Sean de Wolski on 21 Aug 2014
Yes, you are likely using the categorical class from the Stats Toolbox in a release before R2013b, which is when the new one (along with tables) was introduced into base MATLAB.
dpb on 21 Aug 2014
This is R2012b, indeed. It brought my old machine almost to its knees, so I have been reluctant to upgrade further, figuring performance would degrade even more...
Sean de Wolski on 21 Aug 2014
The machine or the ML desktop?
dpb on 21 Aug 2014
Both...I'm not at all enamored of the idea of changing the UI, either...I use very little of it other than keyboard, anyway.
In addition to what Sean said about preallocating the memory using a backwards loop:
  1. you can also assign any value to the last element and then run the loop in the usual direction, and
  2. it will be a performance gain to also preallocate the categories if you know them in advance.
So:
c(1000,1) = categorical('',{'abc' 'def' 'ghi'});
Presto, a 1000x1 categorical array. You could of course also do this
c = categorical(repmat({''},1000,1),{'abc' 'def' 'ghi'});
but there's no real reason to.
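Combining the two points, a forward loop over a preallocated array might look like the sketch below. It assumes the categories are known in advance and uses the repmat form of the constructor; `valueset` and `n` are illustrative names, and the fill values are arbitrary.

```matlab
valueset = {'abc', 'def', 'ghi'};   % categories assumed known in advance
n = 1000;

% Every element starts out <undefined> because '' is not in valueset.
c = categorical(repmat({''}, n, 1), valueset);

% Fill in the usual forward direction. Assigning a name that is already
% a category leaves the category list unchanged.
for ii = 1:n
    c(ii) = valueset{mod(ii, 3) + 1};
end
```

Because the category list is fixed before the loop, each assignment only has to store a code for an existing category.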


More Answers (1)

dpb on 20 Aug 2014
a) You can't: nominal and ordinal create the categorical array from an existing array; no other method is provided.
b) double and the various intXX functions are the numeric conversions; use cellstr or char for character data.
See
doc categorical
for details.
If data are character, you may in the end save memory with such manipulations as
x=nominal(x);
if x is a character variable, but you'll have to hold the original x in memory first. If you're running into memory problems loading the data to begin with, about all I can think of is to load it piecemeal, converting each piece to categorical for a (hopefully sizable) savings in memory that then allows you to load some more.
Or, of course, find a way to process the data other than "all in one swell foop".
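The piecemeal approach could be sketched roughly as follows. This assumes the newer base-MATLAB categorical class (R2013b or later) rather than nominal, and that the category names are known up front; `readChunk` is a hypothetical stand-in for whatever actually reads one block of records from disk.

```matlab
% Hypothetical stand-in for reading one chunk of records from disk.
readChunk = @(k) repmat({'red'; 'green'; 'blue'}, 2, 1);   % 6x1 cellstr

nChunks   = 4;
chunkSize = 6;
valueset  = {'blue', 'green', 'red'};   % assumed known in advance

% Preallocate the full result; every element starts as <undefined>.
c = categorical(repmat({''}, nChunks*chunkSize, 1), valueset);

for k = 1:nChunks
    raw  = readChunk(k);                      % only one chunk of text in memory
    rows = (k-1)*chunkSize + (1:chunkSize);   % where this chunk lands in c
    c(rows) = categorical(raw, valueset);     % store the compact form
    clear raw                                 % release the text before the next read
end
```

Only one chunk of character data exists at any time, so the peak memory is roughly the categorical array plus a single chunk of text.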
