Why is SVM not giving the expected result?

Diver on 26 Oct 2015
Edited: Ilya on 27 Oct 2015
I have training data consisting of only one feature.
The feature has around 113K observations.
  • Only 8K of those observations belong to the positive class.
  • 105K of those observations belong to the negative class.
  • Of the 8K positive observations, 90% are below 1 and 10% are above 1.
  • Of the 105K negative observations, 80% are above 1 and 20% are below 1.
Hence almost any X value below 1 should be predicted as the positive class, and any X value above 1 should be predicted as the negative class.
I used the following fitcsvm call:
svmStruct = fitcsvm(X,Y,'Standardize',true, 'Prior','uniform','KernelFunction','linear','KernelScale','auto','Verbose',1,'IterationLimit',1000000);
At the end, fitcsvm prints a message saying that SVM optimization did not converge to the required tolerance. But why? Most of the first-class X values are below 1 and vice versa, so it should be easy to find a classification boundary. And when I run:
[label,score,cost]= predict(svmStruct, X) ;
it gives wrong predictions.
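One way to see what "did not converge" and "wrong predictions" mean in practice is to look at the model's convergence record and at a confusion matrix. The lines below are a minimal sketch, not the original poster's code: the data are synthetic stand-ins, and the labels 1 and -1 are an assumption, since the contents of "Y.txt" are not shown here. The fitcsvm call is a shortened version of the one in the question.

```matlab
% Hypothetical stand-in data: one feature, imbalanced classes, heavy overlap.
rng(0);                                            % reproducible random numbers
X = [0.98 + 0.02*randn(200,1); 1.02 + 0.02*randn(2000,1)];
Y = [ones(200,1); -ones(2000,1)];                  % assumed labels: 1 and -1

% Shortened version of the call from the question.
svmStruct = fitcsvm(X, Y, 'Standardize', true, 'Prior', 'uniform', ...
    'KernelFunction', 'linear');

% ConvergenceInfo is a struct stored on the fitted model.
info = svmStruct.ConvergenceInfo;
fprintf('Converged: %d\n', info.Converged);

% Resubstitution predictions and confusion matrix
% (rows: true class, columns: predicted class, in the order given by 'order').
label = predict(svmStruct, X);
[C, order] = confusionmat(Y, label);
disp(order);
disp(C);
```

The off-diagonal entries of C show which class is being misclassified, which is usually more informative than an overall error rate when the classes are this imbalanced.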
A portion of my X values is listed below:
0.9911
0.9836
0.9341
0.9751
0.9880
0.9977
0.9853
0.9861
1.0143
1.0086
0.9594
0.9787
0.9927
0.9839
1.0024
0.9931
0.9930
1.0275
The image below shows a gscatter diagram. Note that there are only around 8K positive values and around 105K negative values. Since there is only one feature, I created an X with values from 1 to the length of Y.
I have also attached "features.txt", which contains the feature column, and "Y.txt", which contains the two groups.

Accepted Answer

Ilya on 27 Oct 2015
Edited: Ilya on 27 Oct 2015
This is a difficult problem for SVM. SVM performs best when two classes are separable or have a modest overlap. This is not the case here. To make things even harder for SVM, fewer than 7000 of your 110k points are unique.
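The size of the overlap can be estimated from the rounded figures in the question alone. The arithmetic below is a rough sketch that uses only those approximate counts and percentages; the variable names are made up for illustration.

```matlab
% Approximate counts and fractions taken from the question's description.
nPos = 8000;   fracPosBelow = 0.90;   % positives: about 90% lie below 1
nNeg = 105000; fracNegAbove = 0.80;   % negatives: about 80% lie above 1

posBelow = nPos * fracPosBelow;          % positives below 1
negBelow = nNeg * (1 - fracNegAbove);    % negatives below 1

% Share of the points below 1 that are actually positive.
shareBelow = posBelow / (posBelow + negBelow);

% Accuracy of the rule "below 1 -> positive, above 1 -> negative".
accuracy = (posBelow + nNeg * fracNegAbove) / (nPos + nNeg);

fprintf('Positive share below 1: %.3f\n', shareBelow);
fprintf('Accuracy of threshold rule: %.3f\n', accuracy);
```

With these figures, about 7200 positives and 21000 negatives fall below 1, so only roughly a quarter of the values below 1 are positive. A threshold at 1 would be right for about 81% of all points, which is far from a clean separation.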
Why not use a classifier such as decision tree or linear discriminant?
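As a rough illustration of this suggestion, the sketch below fits a decision tree and a linear discriminant and compares them by 5-fold cross-validation. It is not a tuned solution: the data are synthetic stand-ins, the labels 1 and -1 are assumed, and the 'MinLeafSize' value and the uniform prior (carried over from the question's fitcsvm call) are illustrative choices.

```matlab
% Hypothetical stand-in data: one feature, imbalanced classes, heavy overlap.
rng(0);                                            % reproducible random numbers
X = [0.98 + 0.02*randn(200,1); 1.02 + 0.02*randn(2000,1)];
Y = [ones(200,1); -ones(2000,1)];                  % assumed labels: 1 and -1

% Decision tree; a larger minimum leaf size keeps the tree from
% chasing individual points in the overlapping region.
tree = fitctree(X, Y, 'Prior', 'uniform', 'MinLeafSize', 50);

% Linear discriminant with the same uniform prior.
lda = fitcdiscr(X, Y, 'Prior', 'uniform');

% 5-fold cross-validated classification error for both models.
cvTree = crossval(tree, 'KFold', 5);
cvLda  = crossval(lda,  'KFold', 5);
lossTree = kfoldLoss(cvTree);
lossLda  = kfoldLoss(cvLda);
fprintf('Tree CV loss: %.3f\n', lossTree);
fprintf('LDA  CV loss: %.3f\n', lossLda);
```

With a single feature both models reduce to one or a few thresholds on X, and neither needs the iterative optimization that failed to converge for the SVM.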
