Which data is the final ML model trained on after hyperparameter optimization? / Applying the optimized hyperparameters to new training data

Dear Matlab-Community,
I would be glad if someone could help me with two questions. Let us consider the "fitcdiscr" function.
1.) On which data is the final ML model trained after hyperparameter optimization? Specifically, which of the following is true:
a) The final model is trained only on the defined training set, which is 80% of my data, using the optimal hyperparameters found after i iterations. (That means we have i model trainings using hold-out validation.)
b) The final model is trained on the entire data set with the optimal hyperparameters found after i iterations. (That means we have i + 1 model trainings, i of them using hold-out validation.)
2.) How can I directly reuse the optimized hyperparameters to train the model on new data?
I have added a code snippet below.
I am grateful for any hints!
Thank you
Partition = cvpartition(Data.Response,"HoldOut",0.2, 'Stratify',true);
TrainingSetting.Discr.OptimizationOptions = struct('CVPartition',Partition,'MaxObjectiveEvaluations',30); % 20% Hold-Out-Partition
Model = fitcdiscr(X, Y,'HyperparameterOptimizationOptions',TrainingSetting.Discr.OptimizationOptions)

Answers (1)

Alan Weiss on 9 Dec 2022
With the settings you show, the software does not perform any cross-validation, because no hyperparameter optimization is requested. You need to set the OptimizeHyperparameters argument to something other than the default 'none' when you call fitcdiscr.
Assuming you set something such as 'auto', which as documented varies 'Delta' and 'Gamma' to minimize the cross-validation loss, the software first searches for the hyperparameters that minimize that loss, and then performs one more step to fit the data using the resulting hyperparameters.
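As a rough sketch of both points, the snippet below turns on the optimization and then reuses the selected values on other data. It is not the poster's actual setup: the fisheriris sample data stands in for X and Y, the names Opts, BestDelta, BestGamma, Xnew, Ynew and NewModel are made up for illustration, and the held-out rows are only a placeholder for "new" data. Model.NumObservations is one way to check how many rows the returned model was fitted on.

```matlab
% Sketch only: fisheriris stands in for the poster's X and Y.
load fisheriris                      % provides meas (150x4) and species (150x1)
X = meas;
Y = species;

rng(1);                              % reproducible partition and search
Partition = cvpartition(Y, 'HoldOut', 0.2, 'Stratify', true);

% Same options as in the question, plus quiet output (assumption: no plots wanted).
Opts = struct('CVPartition', Partition, ...
              'MaxObjectiveEvaluations', 30, ...
              'ShowPlots', false, ...
              'Verbose', 0);

% 'auto' varies Delta and Gamma; without OptimizeHyperparameters the
% options struct above has no effect.
Model = fitcdiscr(X, Y, ...
    'OptimizeHyperparameters', 'auto', ...
    'HyperparameterOptimizationOptions', Opts);

% Number of rows used for the returned (final) model.
disp(Model.NumObservations)

% Hyperparameters the returned model actually uses.
BestDelta = Model.Delta;
BestGamma = Model.Gamma;

% Placeholder for new data: here simply the held-out rows.
Xnew = X(test(Partition), :);
Ynew = Y(test(Partition));

% Train on the new data with fixed hyperparameters, no further optimization.
NewModel = fitcdiscr(Xnew, Ynew, 'Delta', BestDelta, 'Gamma', BestGamma);
```

Reading Delta and Gamma from the model itself avoids having to decide which point of the optimization results counts as "best".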
Alan Weiss
MATLAB mathematical toolbox documentation
