Lack of fit with fitrlinear on multivariate data (version 2016a and later)

Jeffrey Hawkes on 19 Dec 2017
Commented: Jeffrey Hawkes on 22 Dec 2017
I'm trying to use fitrlinear to build a linear regression model that predicts a variable x1. There are 15 predictor variables (y1:y15) and 74 observations of each. I'm attaching a CSV of the data.
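% Load the data: column 1 is the response x1, columns 2:end are the predictors y1..y15
pred = readtable('pred2.csv','Delimiter',';');
predX = table2array(pred);
x1 = predX(:,1);
y = predX(:,2:end);

% Hold out 5% of the 74 observations as a test set
cvp = cvpartition(74,'Holdout',0.05);
idxTrain = training(cvp); % Extract training set indices

% Transpose so that observations are in columns (15-by-74), then fit
y = y';
Mdl = fitrlinear(y(:,idxTrain),x1(idxTrain),'ObservationsIn','columns');

% Predict on the held-out observations and compute the test-set loss (MSE by default)
idxTest = test(cvp); % Extract test set indices
yHat = predict(Mdl,y(:,idxTest),'ObservationsIn','columns');
L = loss(Mdl,y(:,idxTest),x1(idxTest),'ObservationsIn','columns')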
This gives an enormous mean squared error (L, on the last line), and I can see that the predicted values in yHat are far off. Most of this code is taken from the MATLAB examples and tutorials on how to run this function. What am I missing?
Perhaps you can suggest a better way to predict this data.

Answers (2)

Ben Drebing on 21 Dec 2017
I would recommend using the Regression Learner app in MATLAB. I find that it really helps when you want to quickly try a bunch of different models on some data. You can get to it by typing
>> regressionLearner

Ilya on 21 Dec 2017
Your test set has floor(74*0.05) = 3 observations. You can't reliably measure the error of any model on such a tiny test set.
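With only 74 observations, k-fold cross-validation is one way to get an error estimate that uses every observation for testing exactly once. The following is a minimal sketch on synthetic data, not the poster's pred2.csv: the variables X and resp, the choice of 5 folds, and the 'leastsquares' learner are assumptions made for illustration.

```matlab
rng(0);                                  % make the synthetic data reproducible
n = 74; p = 15;                          % same shape as the data in the question
X = randn(n,p);                          % hypothetical predictors, observations in rows
resp = 3*X(:,4) + 0.1*randn(n,1);        % hypothetical response driven by column 4

% Fit a 5-fold cross-validated linear model instead of a single tiny holdout.
% 'Learner','leastsquares' is an assumption; the fitrlinear default is 'svm'.
CVMdl = fitrlinear(X, resp, 'KFold', 5, 'Learner', 'leastsquares');

% Mean squared error averaged over the five held-out folds
cvMSE = kfoldLoss(CVMdl);
fprintf('5-fold cross-validated MSE: %.4f\n', cvMSE);
```

Comparing cvMSE with var(resp) gives a rough sense of whether the model beats simply predicting the mean.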
1 comment
Jeffrey Hawkes on 22 Dec 2017
Hi Ilya, thanks for the input. If I change the test set to 20 or 30% of the observations, it doesn't help. One of the variables, y4, is highly correlated with x1, so it shouldn't be difficult for this model to give a good prediction of x1.
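One way to check that expectation is to compare fitrlinear against an ordinary least-squares baseline on the same split. The sketch below uses synthetic data, so the names P, resp, and the scale factors are hypothetical. It also standardizes the predictors by hand before calling fitrlinear, on the assumption (not confirmed anywhere in this thread) that unscaled predictors may be contributing to the poor fit; as far as I know, fitrlinear regularizes by default and has no option to standardize for you.

```matlab
rng(1);                                             % reproducible synthetic data
n = 74; p = 15;
scales = 10.^linspace(0,3,p);                       % hypothetical, very different predictor scales
P = bsxfun(@times, randn(n,p), scales);             % hypothetical predictors, observations in rows
resp = 2*P(:,4) + randn(n,1);                       % hypothetical response tied to column 4

r = corr(P(:,4), resp);                             % Pearson correlation of column 4 with the response

cvp = cvpartition(n, 'Holdout', 0.3);               % 30% test set, as tried in the comment
tr = training(cvp);
te = test(cvp);

% Baseline: ordinary least squares on the raw predictors
lm = fitlm(P(tr,:), resp(tr));
mseLm = mean((resp(te) - predict(lm, P(te,:))).^2);

% fitrlinear on predictors standardized with training-set statistics only
mu = mean(P(tr,:));
sigma = std(P(tr,:));
standardize = @(A) bsxfun(@rdivide, bsxfun(@minus, A, mu), sigma);
Mdl = fitrlinear(standardize(P(tr,:)), resp(tr), 'Learner', 'leastsquares');
mseLin = loss(Mdl, standardize(P(te,:)), resp(te)); % test-set MSE

fprintf('corr = %.3f, fitlm MSE = %.3f, fitrlinear MSE = %.3f\n', r, mseLm, mseLin);
```

If the two errors are close on real data, scaling was probably the issue; if fitlm is much better, a plain least-squares fit may simply suit a 15-predictor problem better than fitrlinear, which is aimed at high-dimensional data.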
