coefTest

Linear hypothesis test on generalized linear regression model coefficients

Description

p = coefTest(mdl) computes the p-value for an F-test that all coefficients in mdl, except the intercept term, are zero.

p = coefTest(mdl,H) performs an F-test that H × B = 0, where B represents the coefficient vector. Use H to specify the coefficients to include in the F-test.

p = coefTest(mdl,H,C) performs an F-test that H × B = C.

[p,F] = coefTest(___) also returns the F-test statistic F, using any of the input argument combinations in the previous syntaxes.

[p,F,r] = coefTest(___) also returns the numerator degrees of freedom r for the test.
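
As a minimal sketch of the three-input, three-output syntax, the following code tests the joint hypothesis that the x1 coefficient equals 1 and the x2 coefficient equals 2. The data generation mirrors the examples on this page; the particular choice of H and C is an illustrative assumption, not part of the original reference.

```matlab
% Simulate Poisson data and fit the model (same setup as the examples).
rng('default') % For reproducibility
rndvars = randn(100,2);
X = [2 + rndvars(:,1),rndvars(:,2)];
mu = exp(1 + X*[1;2]);
y = poissrnd(mu);
mdl = fitglm(X,y,'y ~ x1 + x2','Distribution','poisson');

% Each row of H selects one coefficient. Columns follow the coefficient
% order in the model display: intercept, x1, x2.
H = [0 1 0; 0 0 1];
C = [1; 2];   % hypothesized values, one per row of H (illustrative choice)

% p is the p-value, F is the test statistic, and r is the numerator
% degrees of freedom.
[p,F,r] = coefTest(mdl,H,C)
```

Here H has two rows, so the returned r is expected to be 2.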

Examples

Fit a generalized linear regression model, and test the coefficients of the fitted model to see if they differ from zero.

Generate sample data using Poisson random numbers with two underlying predictors X(:,1) and X(:,2).

rng('default') % For reproducibility
rndvars = randn(100,2);
X = [2 + rndvars(:,1),rndvars(:,2)];
mu = exp(1 + X*[1;2]);
y = poissrnd(mu);

Create a generalized linear regression model of Poisson data.

mdl = fitglm(X,y,'y ~ x1 + x2','Distribution','poisson')
mdl =
Generalized linear regression model:
log(y) ~ 1 + x1 + x2
Distribution = Poisson

Estimated Coefficients:
                   Estimate       SE        tStat     pValue
                   ________    _________    ______    ______

    (Intercept)     1.0405      0.022122    47.034      0
    x1              0.9968      0.003362    296.49      0
    x2               1.987     0.0063433    313.24      0

100 observations, 97 error degrees of freedom
Dispersion: 1
Chi^2-statistic vs. constant model: 2.95e+05, p-value = 0

Test whether the fitted model has coefficients that differ significantly from zero.

p = coefTest(mdl)
p = 4.1131e-153

The small p-value indicates that the model fits significantly better than a degenerate model consisting of only an intercept term.

Fit a generalized linear regression model, and test the significance of a specified coefficient in the fitted model.

Generate sample data using Poisson random numbers with two underlying predictors X(:,1) and X(:,2).

rng('default') % For reproducibility
rndvars = randn(100,2);
X = [2 + rndvars(:,1),rndvars(:,2)];
mu = exp(1 + X*[1;2]);
y = poissrnd(mu);

Create a generalized linear regression model of Poisson data.

mdl = fitglm(X,y,'y ~ x1 + x2','Distribution','poisson')
mdl =
Generalized linear regression model:
log(y) ~ 1 + x1 + x2
Distribution = Poisson

Estimated Coefficients:
                   Estimate       SE        tStat     pValue
                   ________    _________    ______    ______

    (Intercept)     1.0405      0.022122    47.034      0
    x1              0.9968      0.003362    296.49      0
    x2               1.987     0.0063433    313.24      0

100 observations, 97 error degrees of freedom
Dispersion: 1
Chi^2-statistic vs. constant model: 2.95e+05, p-value = 0

Test the significance of the x1 coefficient. According to the model display, x1 is the second coefficient (the intercept is the first). Specify the coefficient by using a numeric index vector.

p = coefTest(mdl,[0 1 0])
p = 2.8681e-145

The returned p-value indicates that x1 is statistically significant in the fitted model.

Input Arguments

mdl: Generalized linear regression model, specified as a GeneralizedLinearModel object created using fitglm or stepwiseglm, or a CompactGeneralizedLinearModel object created using compact.
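
The following sketch illustrates passing a compact model. It fits the Poisson model used in the examples on this page, compacts it with compact, and runs the same single-coefficient test on both objects. The variable names cmdl, pFull, and pCompact are illustrative.

```matlab
% Simulate Poisson data and fit the full model (same setup as the examples).
rng('default') % For reproducibility
rndvars = randn(100,2);
X = [2 + rndvars(:,1),rndvars(:,2)];
mu = exp(1 + X*[1;2]);
y = poissrnd(mu);
mdl = fitglm(X,y,'y ~ x1 + x2','Distribution','poisson');

% Create a compact version of the fitted model.
cmdl = compact(mdl);

% Run the same test on the full and compact models.
pFull = coefTest(mdl,[0 1 0])
pCompact = coefTest(cmdl,[0 1 0])
```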

H: Hypothesis matrix, specified as an r-by-s numeric index matrix, where r is the number of rows of H (one row per coefficient to include in the F-test), and s is the total number of coefficients, including the intercept.

• If you specify H, then the output p is the p-value for an F-test that H × B = 0, where B represents the coefficient vector.

• If you specify H and C, then the output p is the p-value for an F-test that H × B = C.

Example: [1 0 0 0 0] tests the first coefficient among five coefficients.

Data Types: single | double
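
As an illustration of building H, the sketch below uses a two-row index matrix to test the x1 and x2 coefficients jointly in the Poisson model from the examples on this page. Because this H covers every coefficient except the intercept, the result should agree, up to rounding, with the one-input syntax coefTest(mdl); that agreement follows from the syntax descriptions above and is not separately documented here. The variable names are illustrative.

```matlab
% Simulate Poisson data and fit the model (same setup as the examples).
rng('default') % For reproducibility
rndvars = randn(100,2);
X = [2 + rndvars(:,1),rndvars(:,2)];
mu = exp(1 + X*[1;2]);
y = poissrnd(mu);
mdl = fitglm(X,y,'y ~ x1 + x2','Distribution','poisson');

% One row per tested coefficient, one column per model coefficient
% (intercept, x1, x2).
H = [0 1 0;    % selects x1
     0 0 1];   % selects x2

[pJoint,FJoint,rJoint] = coefTest(mdl,H)   % joint test of x1 and x2
pDefault = coefTest(mdl)                   % all coefficients except intercept
```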

C: Hypothesized value for testing the null hypothesis, specified as a numeric vector with the same number of rows as H.

If you specify H and C, then the output p is the p-value for an F-test that H × B = C, where B represents the coefficient vector.

Data Types: single | double

Output Arguments

p: p-value for the F-test, returned as a numeric value in the range [0,1].

F: Value of the test statistic for the F-test, returned as a numeric value.

r: Numerator degrees of freedom for the F-test, returned as a positive integer. The F-statistic has r degrees of freedom in the numerator and mdl.DFE degrees of freedom in the denominator.

Algorithms

The p-value, F-statistic, and numerator degrees of freedom are valid under these assumptions:

• The data comes from a model represented by the formula in the Formula property of the fitted model.

• The observations are independent, conditional on the predictor values.

Under these assumptions, let β represent the (unknown) coefficient vector of the generalized linear regression. Suppose H is a full-rank matrix of size r-by-s, where r is the number of coefficients to include in an F-test, and s is the total number of coefficients. Let c be a column vector with r rows. The following is a test statistic for the hypothesis that Hβ = c:

$F = \left(H\hat{\beta} - c\right)' \left(H V H'\right)^{-1} \left(H\hat{\beta} - c\right).$

Here $\hat{\beta}$ is the estimate of the coefficient vector β, stored in the Coefficients property, and V is the estimated covariance of the coefficient estimates, stored in the CoefficientCovariance property. When the hypothesis is true, the test statistic F has an F distribution with r and u degrees of freedom, where u is the degrees of freedom for error, stored in the DFE property.
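
The quadratic form above can be evaluated by hand from the model properties. The sketch below does this for a single-row H (r = 1), where the expression reduces to a scalar, and places the result next to the output of coefTest for comparison. It assumes the Poisson model from the examples on this page, and it makes no claim about how coefTest computes or scales the statistic internally when r > 1. The names b, V, d, Fmanual, and pManual are illustrative.

```matlab
% Simulate Poisson data and fit the model (same setup as the examples).
rng('default') % For reproducibility
rndvars = randn(100,2);
X = [2 + rndvars(:,1),rndvars(:,2)];
mu = exp(1 + X*[1;2]);
y = poissrnd(mu);
mdl = fitglm(X,y,'y ~ x1 + x2','Distribution','poisson');

H = [0 1 0];   % single restriction on the x1 coefficient, so r = 1
c = 0;         % hypothesized value

b = mdl.Coefficients.Estimate;   % coefficient estimates (s-by-1)
V = mdl.CoefficientCovariance;   % estimated covariance (s-by-s)

% Quadratic form (H*b - c)' * inv(H*V*H') * (H*b - c), written with a
% linear solve instead of an explicit inverse.
d = H*b - c;
Fmanual = d' * ((H*V*H') \ d);

% Upper-tail probability of an F distribution with r and DFE degrees of
% freedom. The 'upper' option avoids computing 1 - fcdf(...), which would
% round to zero for very large statistics.
pManual = fcdf(Fmanual,size(H,1),mdl.DFE,'upper');

% Built-in result for comparison.
[p,F,r] = coefTest(mdl,H,c);
disp([Fmanual F; pManual p])
```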

Alternative Functionality

The values of commonly used test statistics are available in the Coefficients property of a fitted model.
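
For instance, the per-coefficient statistics shown in the model display can be read directly from the Coefficients table. This sketch assumes the Poisson model from the examples on this page; the variable names are illustrative.

```matlab
% Simulate Poisson data and fit the model (same setup as the examples).
rng('default') % For reproducibility
rndvars = randn(100,2);
X = [2 + rndvars(:,1),rndvars(:,2)];
mu = exp(1 + X*[1;2]);
y = poissrnd(mu);
mdl = fitglm(X,y,'y ~ x1 + x2','Distribution','poisson');

% Coefficients is a table with one row per coefficient and the columns
% Estimate, SE, tStat, and pValue.
coefTable = mdl.Coefficients;
tStats = coefTable.tStat     % test statistic for each coefficient
pValues = coefTable.pValue   % p-value for each coefficient

% Select a single coefficient by row name.
x1Row = coefTable('x1',:)
```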

Introduced in R2012a