Real or categorical predictors: which one is faster?
In regression, is there a guideline for treating predictors as real values or as categorical?
In a fitting problem with inputs X, y, where X contains the hour of the day (e.g. 1, 2, 3, etc.), I tend to treat it as a categorical predictor because the number of unique values in X is limited (i.e. 24). Surprisingly, the fit seems slower than when I treat it as a real value in a Gaussian process model fitted with fitrgp.
My questions are:
- why does it take longer with categorical predictor?
- in a similar situation, is there a guideline for deciding whether to treat the predictors as real values or as categorical inputs?
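For anyone who wants to reproduce the comparison, here is a minimal sketch. It is not the original code: the synthetic data, the variable names, and the sample size are assumptions made up for illustration. It fits the same data twice with fitrgp, once with the hour as a numeric predictor and once with it flagged through the 'CategoricalPredictors' option, and times each fit with tic/toc.

```matlab
% Synthetic data (assumed for illustration): hour of day 1..24, 20 samples each.
rng(0);                                   % make the noise reproducible
hourOfDay = repmat((1:24)', 20, 1);       % 480-by-1 predictor
y = sin(2*pi*hourOfDay/24) + 0.1*randn(numel(hourOfDay), 1);

% Fit with the hour treated as a numeric (real-valued) predictor.
tic;
mdlNumeric = fitrgp(hourOfDay, y);
tNumeric = toc;

% Fit with the same column flagged as categorical.
tic;
mdlCategorical = fitrgp(hourOfDay, y, 'CategoricalPredictors', 1);
tCategorical = toc;

fprintf('numeric: %.2f s, categorical: %.2f s\n', tNumeric, tCategorical);
```

Timings from a single run are noisy, so repeating each fit a few times and comparing medians would give a fairer picture.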
3 comments
Walter Roberson
on 17 Sep 2023
Have you experimented with passing uint8 data? I don't know whether that is permitted; if it is, it would signal that discrete algorithms are to be used.
mono
on 17 Sep 2023
"why does it take longer with categorical predictor?"
I'd venture it's owing to the large number of dummy variables introduced when 24 levels of time are modeled as categorical instead of continuous/discrete. You could try artificially reducing the same data set to 24, 12, and 2 levels to see whether that hypothesis is correct.
Regardless of whether it's true, it's the model definition and purpose that should control decisions such as this, not compute time.
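The level-reduction experiment suggested in this comment could be sketched roughly as follows. The synthetic data, the binning scheme, and all variable names are assumptions for illustration only. dummyvar is used just to show how a 24-level variable expands into one indicator column per level; whether fitrgp expands categorical predictors in exactly the same way internally is not confirmed here.

```matlab
% Synthetic data (assumed for illustration): hour of day 1..24, 20 samples each.
rng(0);
hourOfDay = repmat((1:24)', 20, 1);
y = sin(2*pi*hourOfDay/24) + 0.1*randn(numel(hourOfDay), 1);

% Illustration: a 24-level variable becomes 24 indicator (dummy) columns.
D = dummyvar(hourOfDay);
fprintf('dummy matrix size: %d x %d\n', size(D, 1), size(D, 2));

% Coarsen the hour into 24, 12, and 2 levels and time a categorical fit for each.
levelCounts = [24 12 2];
fitTimes = zeros(size(levelCounts));
for k = 1:numel(levelCounts)
    binWidth = 24 / levelCounts(k);
    binned = ceil(hourOfDay / binWidth);  % integer labels 1..levelCounts(k)
    tic;
    mdl = fitrgp(binned, y, 'CategoricalPredictors', 1);
    fitTimes(k) = toc;
end

disp(table(levelCounts', fitTimes', 'VariableNames', {'Levels', 'FitSeconds'}));
```

If the fit time falls as the number of levels falls, that would support the dummy-variable explanation. A single run per setting is only a rough indication.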