Machine learning and data normalization - how data should(?) be normalized.
Hello, I have a general question about data normalization for classification algorithms: if I have a training set and a testing set, should I normalize them separately or join them for the normalization step? And what if I later want to use this classifier on a completely new batch of data? Should I keep the extreme values of each feature and use them for normalization?
Second question I have: Is normalization really necessary? Does SVM need it?
Thank you in advance for any help. Cheers, Michael
Answers (2)
Mostafa Nakhaei
on 18 Oct 2019
Please note that best practice in machine learning is to keep the distributions of the training and testing data the same. So, if you want to normalize your data, it is good to normalize the whole dataset first and then split it. That way, your testing and training sets will have the same distribution. The common error is to split the data and then normalize each part individually.
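For illustration only, here is a minimal Python sketch of the order of operations described in this answer: z-score every feature over the whole dataset, then split. The function names, the 25% test fraction, and the fixed seed are assumptions made for the example and do not come from the thread. A frequently used alternative computes the statistics on the training split only and reuses them for the test split, so that test data does not influence the scaling.

```python
import random
from statistics import mean, pstdev


def zscore_columns(rows):
    """Scale each feature (column) to zero mean and unit variance."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    # A constant column has zero spread; fall back to 1.0 to avoid dividing by zero.
    sigmas = [pstdev(c) or 1.0 for c in cols]
    return [[(x - m) / s for x, m, s in zip(row, mus, sigmas)] for row in rows]


def normalize_then_split(rows, test_fraction=0.25, seed=0):
    """Normalize the full dataset first, then split it at random (hypothetical helper)."""
    scaled = zscore_columns(rows)
    order = list(range(len(scaled)))
    random.Random(seed).shuffle(order)
    n_test = max(1, int(len(scaled) * test_fraction))
    test = [scaled[i] for i in order[:n_test]]
    train = [scaled[i] for i in order[n_test:]]
    return train, test


data = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]]
train_set, test_set = normalize_then_split(data)
```

Both splits here share one set of means and standard deviations, which is the property the answer is after.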
BERGHOUT Tarek
on 3 Feb 2019
1- You can normalize them separately or together, but the best way is to do the normalization inside the training function; if you put the normalization function inside the training function, you can use it for any dataset after that.
2- Yes, normalization is necessary if and only if the activation functions of your training model are bounded; otherwise you don't have to normalize the data,
and for SVM, if the kernel function is bounded, you must normalize your data.
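As a rough sketch of point 1, the Python below puts min-max normalization inside a training function, so the per-feature extremes learned from the training data are stored and reused on any later dataset. Everything named here (`MinMaxScaler`, `train_classifier`, the toy data) is hypothetical, and the nearest-centroid rule is only a stand-in for a real classifier such as an SVM.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MinMaxScaler:
    """Per-feature minima and maxima learned from the training data."""
    mins: List[float]
    maxs: List[float]

    @classmethod
    def fit(cls, rows):
        cols = list(zip(*rows))
        return cls([min(c) for c in cols], [max(c) for c in cols])

    def transform(self, rows):
        out = []
        for row in rows:
            scaled = []
            for x, lo, hi in zip(row, self.mins, self.maxs):
                span = hi - lo
                # Constant features map to 0.0 instead of dividing by zero.
                scaled.append((x - lo) / span if span else 0.0)
            out.append(scaled)
        return out


def train_classifier(rows, labels):
    """Fit the scaler on the training rows only, then build a predictor
    that applies the same stored extremes to whatever data it is given."""
    scaler = MinMaxScaler.fit(rows)
    sums, counts = {}, {}
    for row, y in zip(scaler.transform(rows), labels):
        acc = sums.setdefault(y, [0.0] * len(row))
        for i, v in enumerate(row):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    centroids = {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

    def predict(new_rows):
        preds = []
        for row in scaler.transform(new_rows):
            # Pick the class whose centroid is closest in the scaled space.
            preds.append(min(
                centroids,
                key=lambda y: sum((a - b) ** 2 for a, b in zip(row, centroids[y])),
            ))
        return preds

    return scaler, predict


train_X = [[1.0, 1000.0], [2.0, 1100.0], [8.0, 1050.0], [9.0, 900.0]]
train_y = ["a", "a", "b", "b"]
scaler, predict = train_classifier(train_X, train_y)
new_predictions = predict([[1.5, 950.0], [8.5, 1080.0]])
```

New values outside the stored range simply scale to below 0 or above 1; whether to clip them is a separate design choice that this sketch leaves open.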