Multinomial logistic regression

5 vues (au cours des 30 derniers jours)
eli
eli le 30 Août 2011
Dear all,
I have a (X) as a matrix and (Y) as a vector for example:
X=[1.4 5.2 7.1 3.9; 0.5 2.9 6.8 3.2], Y=[0 0 0 1]
I want to do the Multinomial logistic regression. That is my method:
[b,dev,stats]=glmfit(X,Y,'binomial','link','logit');
XX=linspace(0,7);
yfit=glmval(b,XX,'logit');
plot(X,Y,'o',XX,yfit,'-','LineWidth',2);
Does my method seems correct? but I am not sure about the function ''glmfit'' or ''mnrfit'' and also plotting. Please guide me in this case. Thanks in advance.
eli

Réponses (3)

Daniel Shub
Daniel Shub le 30 Août 2011
The code you have posted looks like logistic regression and your plotting looks reasonable. The documentation for glmfit is a reasonable starting point to understanding logistic regression.
  1 commentaire
eli
eli le 30 Août 2011
Daniel,
But I can not plot it. There is an error on ''yfit'' and I think some thing is wrong with XX=linspace(0,7) maybe It is not appropriate for matrix ''X''. Do you have any idea to solve the problem? Thanks
eli

Connectez-vous pour commenter.


Richard Willey
Richard Willey le 30 Août 2011
The following example deals with Poisson regression rather than logistic regression. I'm posting this because it includes a fair amount of plotting and might prove useful to get a feel for things.
% Generalized Linear Models ecompasses techniques like Poisson
% regression and Logistic regression. This technique is used to model
% counts and estimate odds.
% Formally, the technique was developed to model data where the error terms
% don't exhibit constant variance.
% Specify the average relationship between X and Y
clear all
clc
% Create a set of 100 X values between 0 and 15
X = linspace(0,15,100);
X = X';
% Define a function that describes the average relationship between X and
% Y.
mu = @(x) exp(-1.5 +.25*x);
Y = mu(X);
plot(X,Y);
xlabel('Time');
ylabel('Count');
Use the Y = f(X) to generate a dataset with known characteristics
% At each "X", the observed Y is represented as a draw from a Poisson
% distribution with mean (lambda) = Y
Noisy_Y = poissrnd(Y,100,1);
% Create figure
figure1 = figure;
% Create axes
axes1 = axes('Parent',figure1,'YTick',[0:max(Noisy_Y)], 'YGrid','on');
hold(axes1,'all');
% Create scatter
scatter(X,Noisy_Y, 'b');
xlabel('Time')
ylabel('Count')
% Note the following:
% The mean of a poisson process is also the variance of a poisson process.
% The blue curve doesn't exhibit constant variance (this violates one of
% the key assumptions underlying Ordinary Least Squares)
% The output from a Poisson process is always a non-negative integer. If
% lambda is close to zero AND observations can never going
% non-negative, the shape of the distribution is bounded in one direction.
% As lambda gets larger, this boundary has less and less impact on the
% distribution of the errors.
hold on
plot(X,Y);
Create a fit using nonlinear regression
% The model used for the nonlinear regression is the same as "mu"
foo = fit(X, Noisy_Y, 'exp(B1 + B2*x)');
plot(foo)
xlabel('Time')
ylabel('Count')
SSE_OLS = sum((foo(X) - Noisy_Y).^2)
% Create textbox
annotation(figure1,'textbox',...
[0.147443514644351 0.802739726027397 0.255903765690377 0.0931506849315069],...
'String',{['SSE OLS = ' num2str(SSE_OLS)]},...
'FitBoxToText','off',...
'BackgroundColor',[1 1 1]);
%%Create a fit using a Generalized Linear Model
[b dev stats] = glmfit(X,Noisy_Y, 'poisson');
Predicted_Y = exp(b(1) + b(2)*X);
plot(X, Predicted_Y, 'k');
SSE_GLM = sum(stats.resid.^2)
% Create textbox
annotation(figure1,'textbox',...
[0.147443514644351 0.802739726027397 0.255903765690377 0.0931506849315069],...
'String',{['SSE OLS = ' num2str(SSE_OLS)], ['SSE GLM = ' num2str(SSE_GLM)]},...
'FitBoxToText','off',...
'BackgroundColor',[1 1 1]);
% The two curves (should) look almost the same
% The SSE is (essentially) the same.
% What gives? Why are we bothering with this GLM stuff?
%%Perform a Monte Carlo simulation
% Generate 1000 data sets that are consistent with with the assumed
% relationship between Count and Time
parfor i = 1:1000
Noisy_Y = poissrnd(Y, 100,1);
b = glmfit(X,Noisy_Y, 'poisson');
GLM_Prediction(:,i) = exp(b(1) + b(2)*X);
% Provide optimal starting conditions (make sure that ...
% any differences are related to the algorithms rather than
% convergence
foo = fit(X, Noisy_Y, 'exp(B1 + B2*x)', 'Startpoint', [-1.5, .25]);
B = coeffvalues(foo);
OLS_Prediction(:,i) = exp(B(1) + B(2)*X);
end
figure
True = plot(X, Y, 'b');
hold on
GLM_Mean = plot(X, mean(GLM_Prediction, 2),'k')
OLS_Mean = plot(X, mean(OLS_Prediction, 2),'r')
xlabel('Time')
ylabel('Count')
% The means of the two two techniques are, for all intents and purposes,
% identical.
% The standard deviation from the GLM is MUCH smaller. Also note that
% I provided the nonlinear regression with optimal starting conditions.
% Formally, the advantage to using a GLM has to do with the scope of the
% technique (The range of conditions over which the GLM will generate a
% good fit). GLMs have a much broader scope. Conversely, using a
% nonlinear regression to fit the inverse of the link function can (badly)
% misfire on occasion.
% glmfit is more robust
GLM_std = std(GLM_Prediction, [], 2);
OLS_std = std(OLS_Prediction, [], 2);
figure
plot(X, OLS_std./GLM_std)
xlabel('Time')
ylabel('Ratio of OLS std:GLM std')
  1 commentaire
eli
eli le 5 Sep 2011
Thanks Richard,
Unfortunately it is not what I mean, but however useful it the related cases :)
Regards,
eli

Connectez-vous pour commenter.


eli
eli le 5 Sep 2011
I found that in the case that I have a matrix as X and the vector Y as [0 1] event, the Multinomial logistic regression functions are as bellow :
I have a (X) as a matrix and (Y) as a vector for example:
X=[1.4 5.2 7.1 3.9; 0.5 2.9 6.8 3.2], Y=[0 0 0 1]
I want to do the Multinomial logistic regression. That is the right method in this case:
[b,dev,stats]=glmfit(X,Y,'binomial','link','logit');
PHAT = mnrval(b,X,'model','nominal','interactions','on');
but for plotting I have problem. I have been confused. It should be however 3D plot. The below instruction does not work:
plot(X3,Z3,'o',X3,yfit3,'-','LineWidth',2,'Color','green');
Can anybody give the valuable idea for plotting the Multinomial logistic regression. Thanks in advance.
eli

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by