testDeviance

Deviance test for multinomial regression model

Since R2023a

Syntax

p = testDeviance(mdl)

[p,testStat] = testDeviance(mdl)

Description

p = testDeviance(mdl) returns the p-value for a test that determines whether the fitted model in the MultinomialRegression model object mdl fits significantly better than an intercept-only model.

[p,testStat] = testDeviance(mdl) also returns the value of the test statistic used to generate the p-value.

Examples

Perform Deviance Test

Load the fisheriris sample data set.

load fisheriris

The column vector species contains three iris flower species: setosa, versicolor, and virginica. The matrix meas contains four types of measurements for the flowers: the length and width of sepals and petals in centimeters.

Fit a multinomial regression model using meas as the predictor data and species as the response data.

mdl = fitmnr(meas,species);

mdl is a multinomial regression model object that contains the results of fitting a nominal multinomial regression model to the data.

Perform a chi-squared test with the null hypothesis that an intercept-only model performs as well as the model mdl.

p = testDeviance(mdl)

p = 
7.0555e-64

The small p-value indicates that enough evidence exists to reject the null hypothesis and conclude that mdl performs better than the intercept-only model.

Get Test Statistic for Deviance Test

Load the carbig sample data set.

load carbig

The variables MPG and Origin contain data for car mileage and country of origin, respectively.

Fit a multinomial regression model with MPG as the predictor data and Origin as the response. Estimate the dispersion parameter during the fitting.

mdl = fitmnr(MPG,Origin,EstimateDispersion=true);

mdl is a multinomial regression model object that contains the results of fitting a nominal multinomial regression model to the data.

Perform an F-test with the null hypothesis that an intercept-only model fits the data as well as the model mdl. Display the p-value and the F-statistic.

[p,tStats] = testDeviance(mdl)

p = 
1.2314e-45

tStats = 
39.1789

The small p-value indicates that enough evidence exists to reject the null hypothesis and conclude that mdl performs better than the intercept-only model.

Input Arguments

`mdl` — Multinomial regression model object
`MultinomialRegression` model object

Multinomial regression model object, specified as a MultinomialRegression model object created with the fitmnr function.

Output Arguments

`p` — Deviance test p-value
numeric scalar in the range [0,1]

Deviance test p-value, returned as a numeric scalar in the range [0,1].

`testStat` — Deviance test statistic
numeric scalar

Deviance test statistic, returned as a numeric scalar. If mdl.Dispersion is estimated, testDeviance performs an F-test to determine whether the fitted model mdl fits better than an intercept-only model. If mdl.Dispersion is not estimated, testDeviance performs a chi-squared test instead.

More About

Deviance

Deviance is a generalization of the residual sum of squares. It measures the goodness of fit compared to a saturated model.

The deviance of a model M₁ is twice the difference between the loglikelihood of the saturated model M_S and the loglikelihood of the model M₁. A saturated model is a model with the maximum number of parameters that you can estimate.

For example, if you have n observations (y_i, i = 1, 2, ..., n) with potentially different values for X_i^Tβ, then you can define a saturated model with n parameters. Let L(b,y) denote the maximum value of the likelihood function for a model with the parameters b. Then the deviance of the model M₁ is

$- 2 (\log L (b_{1}, y) - \log L (b_{S}, y)),$

where b₁ and b_S contain the estimated parameters for the model M₁ and the saturated model, respectively. The deviance has a chi-squared distribution with n – p degrees of freedom, where n is the number of parameters in the saturated model and p is the number of parameters in the model M₁.
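As a minimal numerical sketch of this definition, consider ungrouped binary responses, for which the saturated model reproduces every observation exactly and therefore has a loglikelihood of zero. The response vector y and the intercept-only fitted probabilities mu are made-up values for illustration and are unrelated to the examples on this page.

```matlab
% Hypothetical binary responses (illustration only)
y = [1 0 1 1 0]';

% Intercept-only model M1: every fitted probability equals the sample mean
mu = mean(y)*ones(size(y));

% Maximized loglikelihood of M1 under a Bernoulli model
logL1 = sum(y.*log(mu) + (1-y).*log(1-mu));

% The saturated model fits each binary observation exactly,
% so its loglikelihood is 0
logLS = 0;

% Deviance of M1: -2*(log L(b1,y) - log L(bS,y))
dev = -2*(logL1 - logLS)
```

For these values the deviance is approximately 6.73. Here the saturated model has n = 5 parameters and M₁ has p = 1.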

Assume you have two different generalized linear regression models M₁ and M₂, and M₂ has a subset of the terms in M₁. You can assess the fit of the models by comparing their deviances D₁ and D₂. The difference of the deviances is

$\begin{array}{l} D = D_{2} - D_{1} = - 2 (\log L (b_{2}, y) - \log L (b_{S}, y)) + 2 (\log L (b_{1}, y) - \log L (b_{S}, y)) \\ = - 2 (\log L (b_{2}, y) - \log L (b_{1}, y)) . \end{array}$

Asymptotically, the difference D has a chi-squared distribution with degrees of freedom v equal to the difference in the number of parameters estimated in M₁ and M₂. You can obtain the p-value for this test by using 1 - chi2cdf(D,v).

Typically, you examine D using a model M₂ with a constant term and no predictors. Therefore, D has a chi-squared distribution with p – 1 degrees of freedom. If the dispersion is estimated, the difference divided by the estimated dispersion has an F distribution with p – 1 numerator degrees of freedom and n – p denominator degrees of freedom.
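The following sketch turns a difference of deviances into a p-value for both cases. All numbers (the loglikelihoods, n, p, and the dispersion estimate) are hypothetical. The F statistic here is also divided by the numerator degrees of freedom, which is the usual convention for an F statistic; whether testDeviance scales its statistic in exactly this way is an assumption.

```matlab
% Hypothetical maximized loglikelihoods (illustration only)
logL1 = -120.5;    % fitted model M1 with p parameters
logL2 = -150.2;    % model M2 with a constant term and no predictors

% Hypothetical sizes and dispersion estimate
n = 100;           % number of parameters in the saturated model
p = 4;             % number of parameters in M1
dispersion = 1.8;  % estimated dispersion

% Difference of the deviances: D = -2*(log L(b2,y) - log L(b1,y))
D = -2*(logL2 - logL1);

% Chi-squared test, for the case where the dispersion is not estimated
pChi2 = chi2cdf(D,p-1,"upper");

% F-test, for the case where the dispersion is estimated.
% Assumption: the statistic is D divided by the numerator degrees
% of freedom and by the estimated dispersion.
F = (D/(p-1))/dispersion;
pF = fcdf(F,p-1,n-p,"upper");
```

The "upper" option returns the upper-tail probability directly. It is mathematically equivalent to 1 - chi2cdf(D,v) but keeps more precision when the p-value is very small, as it is in the examples on this page.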

Alternative Functionality

coefTest performs an F-test to determine whether the coefficient estimates in mdl are zero. If you do not specify coefficients to test, coefTest tests whether the model mdl is a better fit to the data than a model with no coefficients.
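As a rough sketch of how the two functions can be called side by side, the following code refits the model from the first example and runs both tests. The variable names pDeviance and pCoef are illustrative. The two p-values are not expected to match, because the tests are based on different statistics.

```matlab
% Fit the same model as in the first example
load fisheriris
mdl = fitmnr(meas,species);

% Deviance test against an intercept-only model
pDeviance = testDeviance(mdl);

% F-test on the coefficients, with no coefficients specified
pCoef = coefTest(mdl);
```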

Version History

Introduced in R2023a
