# testDeviance

Deviance test for multinomial regression model

Since R2023a

## Description

p = testDeviance(mdl) returns the p-value for a test that determines whether the fitted model in the MultinomialRegression model object mdl fits significantly better than an intercept-only model.

[p,testStat] = testDeviance(mdl) also returns the value of the test statistic used to generate the p-value.

## Examples

Load the fisheriris sample data set.

load fisheriris

The column vector species contains three iris flower species: setosa, versicolor, and virginica. The matrix meas contains four types of measurements for the flowers: the length and width of sepals and petals in centimeters.

Fit a multinomial regression model using meas as the predictor data and species as the response data.

mdl = fitmnr(meas,species);

mdl is a multinomial regression model object that contains the results of fitting a nominal multinomial regression model to the data.

Perform a chi-squared test with the null hypothesis that an intercept-only model performs as well as the model mdl.

p = testDeviance(mdl)
p = 7.0555e-64

The small p-value indicates that enough evidence exists to reject the null hypothesis and conclude that mdl performs better than the intercept-only model.
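The decision described above can be sketched in code. This is a minimal illustration, not part of the original example: the significance level alpha = 0.05 and the variable name rejectNull are assumptions chosen for this sketch.

```matlab
load fisheriris

% Fit the nominal multinomial regression model from the example above.
mdl = fitmnr(meas,species);

% Request both outputs: the p-value and the test statistic.
[p,testStat] = testDeviance(mdl);

% Assumed significance level (hypothetical choice for this sketch).
alpha = 0.05;

% True when the p-value falls below the chosen significance level.
rejectNull = p < alpha
```

With the p-value reported above, rejectNull would be true for any conventional choice of alpha.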

Load the carbig sample data set.

load carbig

The variables MPG and Origin contain data for car mileage and country of origin, respectively.

Fit a multinomial regression model with MPG as the predictor data and Origin as the response. Estimate the dispersion parameter during the fitting.

mdl = fitmnr(MPG,Origin,EstimateDispersion=true);

mdl is a multinomial regression model object that contains the results of fitting a nominal multinomial regression model to the data.

Perform an F-test with the null hypothesis that an intercept-only model fits the data as well as the model mdl. Display the p-value and the F-statistic.

[p,tStats] = testDeviance(mdl)
p = 1.2314e-45

tStats = 39.1789

The small p-value indicates that enough evidence exists to reject the null hypothesis and conclude that mdl performs better than the intercept-only model.
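To see how the dispersion setting changes the test, you can fit the same model twice and call testDeviance on each fit. The following is a sketch only; the variable names mdlFixed, mdlEst, pChi2, chi2Stat, pF, and fStat are hypothetical, and no output is shown because it depends on the fitted models.

```matlab
load carbig

% Fit the same model twice; only the dispersion setting differs.
mdlFixed = fitmnr(MPG,Origin);                          % dispersion not estimated
mdlEst   = fitmnr(MPG,Origin,EstimateDispersion=true);  % dispersion estimated

% Dispersion not estimated: testDeviance performs a chi-squared test.
[pChi2,chi2Stat] = testDeviance(mdlFixed)

% Dispersion estimated: testDeviance performs an F-test.
[pF,fStat] = testDeviance(mdlEst)
```

The two test statistics are on different scales, so compare the p-values rather than the statistics themselves.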

## Input Arguments

mdl: Multinomial regression model object, specified as a MultinomialRegression model object created with the fitmnr function.

## Output Arguments

p: Deviance test p-value, returned as a numeric scalar in the range [0,1].

testStat: Deviance test statistic, returned as a numeric scalar. If mdl.Dispersion is estimated, testDeviance performs an F-test to determine whether the fitted model mdl fits better than an intercept-only model, and testStat is the F-statistic. If mdl.Dispersion is not estimated, testDeviance performs a chi-squared test instead, and testStat is the chi-squared statistic.

## More About

### Deviance

Deviance is a generalization of the residual sum of squares. It measures the goodness of fit compared to a saturated model.

The deviance of a model M1 is twice the difference between the loglikelihood of the saturated model MS and the loglikelihood of the model M1. A saturated model is a model with the maximum number of parameters that you can estimate.

For example, if you have n observations ($y_i$, i = 1, 2, ..., n) with potentially different values for $X_i^T\beta$, then you can define a saturated model with n parameters. Let L(b,y) denote the maximum value of the likelihood function for a model with the parameters b. Then the deviance of the model M1 is

$-2\left(\mathrm{log}L\left({b}_{1},y\right)-\mathrm{log}L\left({b}_{S},y\right)\right),$

where b1 and bS contain the estimated parameters for the model M1 and the saturated model, respectively. The deviance has a chi-squared distribution with n – p degrees of freedom, where n is the number of parameters in the saturated model and p is the number of parameters in the model M1.

Assume you have two different generalized linear regression models M1 and M2, and M2 has a subset of the terms in M1. You can assess the fit of the models by comparing their deviances D1 and D2. The difference of the deviances is

$\begin{aligned}D = D_2 - D_1 &= -2\left(\log L(b_2,y)-\log L(b_S,y)\right)+2\left(\log L(b_1,y)-\log L(b_S,y)\right)\\ &= -2\left(\log L(b_2,y)-\log L(b_1,y)\right).\end{aligned}$

Asymptotically, the difference D has a chi-squared distribution with degrees of freedom v equal to the difference in the number of parameters estimated in M1 and M2. You can obtain the p-value for this test by using 1 – chi2cdf(D,v).
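As a hedged sketch of this calculation, the following code evaluates the p-value in two ways. The values of D and v are hypothetical and chosen only for illustration; they do not come from a fitted model. For very small p-values, subtracting the cdf from 1 can round to zero in double precision, so the 'upper' option of chi2cdf, which computes the upper tail directly, may be preferable.

```matlab
% Hypothetical values for illustration only (not taken from a fitted model).
D = 300;   % difference of deviances
v = 2;     % difference in the number of estimated parameters

% Complement computed by subtraction, as in the expression in the text.
pSubtract = 1 - chi2cdf(D,v)

% Upper tail computed directly, which avoids the subtraction.
pUpper = chi2cdf(D,v,'upper')
```

With a statistic this large, the subtraction form is expected to return 0, while the upper-tail form returns a very small positive value.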

Typically, you examine D using a model M2 with a constant term and no predictors. Therefore, D has a chi-squared distribution with p – 1 degrees of freedom. If the dispersion is estimated, the difference divided by the estimated dispersion has an F distribution with p – 1 numerator degrees of freedom and n – p denominator degrees of freedom.
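The F-test case can be sketched the same way. All numbers below are hypothetical. The sketch also assumes the standard generalized linear model form of the statistic, in which the difference of deviances is divided by both the estimated dispersion and the numerator degrees of freedom; the text above does not spell out that scaling, so treat it as an assumption rather than a description of what testDeviance computes internally.

```matlab
% Hypothetical values for illustration only (not taken from a fitted model).
D   = 78.4;   % difference of deviances
phi = 1.2;    % estimated dispersion
n   = 100;    % number of parameters in the saturated model
p   = 3;      % number of parameters in the model M1

v1 = p - 1;   % numerator degrees of freedom
v2 = n - p;   % denominator degrees of freedom

% Assumption: scale by the dispersion and by the numerator degrees of freedom.
F = D/(v1*phi);

% Upper-tail probability of the F distribution.
pValue = fcdf(F,v1,v2,'upper')
```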

## Alternative Functionality

coefTest performs an F-test to determine whether the coefficient estimates in mdl are zero. If you do not specify coefficients to test, coefTest tests whether the model mdl is a better fit to the data than a model with no coefficients.
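A minimal sketch comparing the two functions on the same fitted model follows. The variable names pDeviance and pCoef are hypothetical, and no output is shown. Because the two functions use different test statistics, the p-values are not expected to match exactly.

```matlab
load fisheriris

% Fit a nominal multinomial regression model.
mdl = fitmnr(meas,species);

% Deviance test against an intercept-only model.
pDeviance = testDeviance(mdl)

% F-test with no coefficients specified, as described above.
pCoef = coefTest(mdl)
```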

## Version History

Introduced in R2023a