Pseudo R-squared measure for Poisson regression models

Computes the pseudo R-squared goodness-of-fit measure for Poisson regression models from real and estimated data
188 downloads
Updated 22 August 2018


function pR2 = pseudoR2( realData, estimatedData, lambda )

Computes the pseudo R-squared (pR2) goodness-of-fit measure for Poisson regression models from real and estimated data, following [1, page 255, first equation].
The pseudo R-squared measure was introduced in [3] to evaluate goodness of fit for Poisson regression models; see also [1,2], where an adjusted pR2 measure was introduced for Poisson regression models with over- or under-dispersion. Poisson regression models are often used to model count data [1] and, in particular, spike data [4,5,6,8]. Pseudo R-squared values can be interpreted as the relative reduction in deviance due to the covariates added to the model [5]. The measure was used to assess goodness of fit when predicting spike counts in [4,5,6,8].
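
For orientation, the deviance-based pseudo R-squared for a Poisson model is commonly written as below. This is a sketch that assumes this form matches the equation referenced in [1]; the symbols y_i (realData), \hat{\mu}_i (estimatedData) and \bar{y} (lambda) are notation introduced here, and y_i \log y_i is taken as 0 when y_i = 0.

```latex
\[
pR^2 \;=\; 1 \;-\;
\frac{\displaystyle\sum_{i=1}^{N}\left[\, y_i \log\!\left(\frac{y_i}{\hat{\mu}_i}\right) - \left(y_i - \hat{\mu}_i\right) \right]}
     {\displaystyle\sum_{i=1}^{N} y_i \log\!\left(\frac{y_i}{\bar{y}}\right)}
\]
```

The numerator is half the deviance of the fitted model and the denominator is half the deviance of the intercept-only model, so the ratio is the fraction of the null deviance left unexplained. The term (y_i - \bar{y}) drops out of the denominator because it sums to zero when \bar{y} is the sample mean.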

INPUT
- realData - observed values of the dependent variable (1xN values);
- estimatedData - estimated values (1xN values);
- lambda - mean value over realData (1x1 value).

OUTPUT
- pR2 - value of the pseudo R-squared measure (1x1 value).
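
The interface above could be implemented roughly as follows. This is a minimal sketch under the assumption that pR2 is the deviance-based measure (one minus the ratio of model deviance to null deviance); the name pseudoR2Sketch is hypothetical and this is not the code shipped with the submission.

```matlab
function pR2 = pseudoR2Sketch( realData, estimatedData, lambda )
% Hypothetical sketch of a deviance-based pseudo R-squared for Poisson
% models; assumed formula, not the author's implementation.
y  = realData( : );        % observed counts as a column vector
mu = estimatedData( : );   % model predictions as a column vector

% y .* log( y ./ mu ) is taken as 0 where y == 0, so only positive
% counts contribute to the logarithmic terms
isPositive = y > 0;
logTermModel = zeros( size( y ) );
logTermNull  = zeros( size( y ) );
logTermModel( isPositive ) = y( isPositive ) .* log( y( isPositive ) ./ mu( isPositive ) );
logTermNull( isPositive )  = y( isPositive ) .* log( y( isPositive ) ./ lambda );

% Poisson deviance of the fitted model and of the constant model lambda;
% the ( y - lambda ) term sums to zero when lambda equals mean( y )
devianceModel = 2 * sum( logTermModel - ( y - mu ) );
devianceNull  = 2 * sum( logTermNull - ( y - lambda ) );

pR2 = 1 - devianceModel / devianceNull;
end
```

With this sketch, a perfect prediction gives pR2 = 1 and predicting the mean for every point gives pR2 = 0; predictions worse than the mean give negative values, which can happen with cross-validated estimates.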

EXAMPLE OF USE
% 'arsdata_1950_2010.xls' is from
% http://www.maths.lth.se/matstat/kurser/fmsf60/_Labfiles/arsdata_1950_2010.xls

data = xlsread( 'arsdata_1950_2010.xls' ); % read data
startPoint = 26;
traffic = struct( 'year', data( startPoint:end, 1 ), ...
    'killed', data( startPoint:end, 2 ), ...
    'cars', data( startPoint:end, 5 ), ...
    'petrol', data( startPoint:end, 6 ) );
y = traffic.killed;
x = cell( 1, 3 ); % covariates or predictors
estCoeff = cell( 1, 3 ); % estimated coefficients of model fit
yEstimated = cell( 1, 3 );
pR2value = zeros( 1, 3 );
x{ 1 } = traffic.year - mean( traffic.year );
x{ 2 } = [ x{ 1 }, traffic.cars - mean( traffic.cars ) ];
x{ 3 } = [ x{ 2 }, traffic.petrol - mean( traffic.petrol ) ];
for iCovariate = 1:3
    % leave-one-out cross-validation
    for iPoint = 1:length( y )
        trainingSet = [ 1:iPoint - 1, iPoint+1:length( y ) ];
        estCoeff{ iCovariate } = glmfit( x{ iCovariate }( trainingSet, : ), ...
            y( trainingSet ), 'poisson', 'link', 'log' );
        yEstimated{ iCovariate }( iPoint ) = glmval( estCoeff{ iCovariate }, ...
            x{ iCovariate }( iPoint, : ), 'log' );
    end
    pR2value( iCovariate ) = pseudoR2( y', yEstimated{ iCovariate }, mean( y ) );
end

% plot results
fontSize = 20;

figure;
plot( traffic.year, traffic.killed, '-', 'LineWidth', 4 ); hold on;
for iCovariate = 1:3
    plot( traffic.year, yEstimated{ iCovariate }, 'o', 'LineWidth', 4, ...
        'markerSize', 8 );
end
xlabel( 'Year', 'FontSize', fontSize );
ylabel( 'Number of people killed in accidents', 'FontSize', fontSize );
legendHandle = legend( 'Real', ...
[ 'Estimated from year, $pR^2$ = ' num2str( pR2value( 1 ), '%.3f' ) ], ...
[ 'Estimated from year and cars, $pR^2$ = ' num2str( pR2value( 2 ), '%.3f' ) ], ...
[ 'Estimated from year, cars and petrol, $pR^2$ = ' num2str( pR2value( 3 ), '%.3f' ) ] );
legendHandle.FontSize = fontSize;
set( legendHandle, 'Interpreter', 'Latex' );
set( gca, 'FontSize', fontSize );
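
% A smaller check that does not need the spreadsheet can be sketched on
% synthetic data. Everything below is an assumption for illustration: the
% simulated covariate, the coefficients 0.5 and 0.8, and the inline deviance
% formula (assumed to be the deviance-based pR2). It needs the Statistics and
% Machine Learning Toolbox for poissrnd, glmfit and glmval.

```matlab
% Synthetic Poisson data with a log link (illustrative values only)
rng( 1 );                                   % fixed seed for reproducibility
nPoints = 200;
x = randn( nPoints, 1 );                    % single simulated covariate
y = poissrnd( exp( 0.5 + 0.8 * x ) );       % simulated counts

% In-sample Poisson fit and predictions
coeff = glmfit( x, y, 'poisson', 'link', 'log' );
yHat  = glmval( coeff, x, 'log' );

% a .* log( a ./ b ), evaluating to 0 where a == 0
xLogRatio = @( a, b ) a .* log( max( a, realmin ) ./ b );

% Deviance of the fitted model and of the mean-only model
devianceModel = 2 * sum( xLogRatio( y, yHat ) - ( y - yHat ) );
devianceNull  = 2 * sum( xLogRatio( y, mean( y ) ) );

pR2 = 1 - devianceModel / devianceNull;
fprintf( 'In-sample pseudo R-squared: %.3f\n', pR2 );
```

% With pseudoR2 on the path, the deviance lines could be swapped for
% pseudoR2( y', yHat', mean( y ) ). Because the fit here is in-sample and
% includes an intercept, the value lies between 0 and 1; cross-validated
% estimates, as in the example above, can fall below 0.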

REFERENCES

[1] Heinzl, H. and Mittlböck, M., 2003. Pseudo R-squared measures for Poisson regression models with over- or underdispersion. Computational Statistics & Data Analysis, 44(1-2), pp.253-271.
[2] Mittlböck, M., 2002. Calculating adjusted R2 measures for Poisson regression models. Computer Methods and Programs in Biomedicine, 68(3), pp.205-214.
[3] Cameron, A.C. and Windmeijer, F.A., 1996. R-squared measures for count data regression models with applications to health-care utilization. Journal of Business & Economic Statistics, 14(2), pp.209-220.
[4] Benjamin, A.S., Fernandes, H.L., Tomlinson, T., Ramkumar, P., VerSteeg, C., Miller, L. and Kording, K.P., 2017. Modern machine learning far outperforms GLMs at predicting spikes. bioRxiv, p.111450.
[5] Fernandes, H.L., Stevenson, I.H., Phillips, A.N., Segraves, M.A. and Kording, K.P., 2013. Saliency and saccade encoding in the frontal eye field during natural scene search. Cerebral Cortex, 24(12), pp.3232-3245.
[6] http://www.math.chalmers.se/Stat/Grundutb/CTH/mve300/1112/files/lab4/lab4.pdf
[7] Glaser, J.I., Perich, M.G., Ramkumar, P., Miller, L.E. and Kording, K.P., 2018. Population coding of conditional probability distributions in dorsal premotor cortex. Nature Communications, 9(1), p.1788.
[8] Ramkumar, P., Lawlor, P.N., Glaser, J.I., Wood, D.K., Phillips, A.N., Segraves, M.A. and Kording, K.P., 2016. Feature-based attention and spatial selection in frontal eye fields during natural scene search. Journal of neurophysiology, 116(3), pp.1328-1343.

Cite As

Valentina Unakafova (2024). Pseudo R-squared measure for Poisson regression models (https://www.mathworks.com/matlabcentral/fileexchange/67041-pseudo-r-squared-measure-for-poisson-regression-models), MATLAB Central File Exchange.

MATLAB Release Compatibility
Created with R2016a
Compatible with any release
Platform Compatibility
Windows macOS Linux
Acknowledgements

Inspired: Importance of cross-validation

Version    Release Notes
1.1.3      Example of use has been changed (leave-one-out cross-validation has been added)
1.1.2      Cover picture has been renewed
1.1.1      -
1.1        Example of use has been shortened and made more readable
1.0.2.2    Misprints in the description have been corrected
1.0.2.1    Description has been updated
1.0.2.0    -
1.0.1.0    -
1.0.0.0