Proper use of ClassificationTree.fit for categorical variables?
The documentation for fitting classification trees states that X needs to be a floating-point array, but it also indicates that columns of X can represent categorical variables (via the 'CategoricalPredictors' name-value argument).
Is the proper way to handle this to
(1) Take the categorical variables, e.g.
category1 = {'duck','duck','goose','squash','quartz'}';
category2 = {'animal','animal','animal','vegetable','mineral'}';
(2) Run those through grp2idx() to get integer group indices:
numcat1 = grp2idx(category1);
numcat2 = grp2idx(category2);
(3) Embed those in my X:
X = [numcat1 numcat2 otherTrulyNumericalVariables]
(4) Identify those columns as categorical:
tree = ClassificationTree.fit(X,Y,'CategoricalPredictors',[1 2])
Seems like that's probably right, but I'd love an expert to vet that idea. The documentation doesn't have a categorical example.
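The four steps above can be put together as one runnable sketch. It is only an illustration under stated assumptions: the numeric predictor `weight` and the response `Y` are made up here because the question does not define them, and the call uses `fitctree`, which assumes R2014a or later; on older releases `ClassificationTree.fit` takes the same arguments.

```matlab
% Step 1: categorical variables as cell arrays of strings (from the question)
category1 = {'duck','duck','goose','squash','quartz'}';
category2 = {'animal','animal','animal','vegetable','mineral'}';

% Step 2: map each distinct label to an integer group index.
% The second output keeps the index-to-label mapping for decoding later.
[numcat1, names1] = grp2idx(category1);
[numcat2, names2] = grp2idx(category2);

% Hypothetical numeric predictor and response, invented for illustration only
weight = [1.2; 1.4; 3.5; 0.9; 2.7];
Y = {'yes','yes','yes','no','no'}';

% Step 3: embed the indices alongside the truly numeric column
X = [numcat1 numcat2 weight];

% Step 4: flag columns 1 and 2 as categorical so the tree does not
% treat the group indices as ordered numbers
tree = fitctree(X, Y, 'CategoricalPredictors', [1 2]);
```

Without the 'CategoricalPredictors' flag, the indices would be treated as ordinary numbers and split on thresholds such as "index < 2.5", which imposes an arbitrary order on the labels. Keeping `names1` and `names2` lets you recover the original labels with `names1(numcat1)`, and lets you encode new data with the same mapping before calling `predict`.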
Accepted Answer