# coefTest

Linear hypothesis test on multinomial regression model coefficients

Since R2023a

## Syntax

``p = coefTest(mdl)``
``p = coefTest(mdl,H)``
``p = coefTest(mdl,H,C)``
``[p,F] = coefTest(___)``
``[p,F,r] = coefTest(___)``

## Description


`p = coefTest(mdl)` computes the p-value for an F-test that all coefficient estimates in `mdl`, except for the intercept terms, are zero.

`p = coefTest(mdl,H)` performs an F-test that H × B = 0, where B represents the coefficient vector. Use `H` to specify the coefficients to include in the F-test.

`p = coefTest(mdl,H,C)` performs an F-test that H × B = C.

`[p,F] = coefTest(___)` also returns the F-test statistic `F` using any of the input argument combinations in the previous syntaxes.


`[p,F,r] = coefTest(___)` also returns the numerator degrees of freedom `r` for the test.

## Examples


Load the `fisheriris` data set.

`load fisheriris`

The column vector `species` contains iris flowers of three different species: setosa, versicolor, and virginica. The matrix `meas` contains four types of measurements for the flowers: the length and width of sepals and petals in centimeters.

Create a table from the iris measurements and species data by using the `array2table` function.

```
tbl = array2table(meas,...
    VariableNames=["SepalLength","SepalWidth","PetalLength","PetalWidth"]);
tbl.Species = species;
```

Fit a multinomial regression model using the petal measurements as the predictor data and the species as the response data.

`mdl = fitmnr(tbl,"Species ~ PetalLength + PetalWidth^2")`
```
mdl = 
Multinomial regression with nominal responses

                                Value       SE       tStat       pValue  
                               _______    ______    _______    __________

    (Intercept_setosa)           136.9    12.587     10.876    1.4933e-27
    PetalLength_setosa         -17.351    7.0021     -2.478      0.013211
    PetalWidth_setosa          -77.383     24.06    -3.2163     0.0012987
    PetalWidth^2_setosa        -24.719    8.3324    -2.9666     0.0030111
    (Intercept_versicolor)      8.2731    14.489      0.571         0.568
    PetalLength_versicolor     -5.7089    2.0638    -2.7662     0.0056709
    PetalWidth_versicolor       35.208     21.97     1.6026       0.10903
    PetalWidth^2_versicolor    -14.041    7.1653    -1.9596      0.050037

150 observations, 292 error degrees of freedom
Dispersion: 1
Chi^2-statistic vs. constant model: 309.3988, p-value = 7.9151e-64
```

`mdl` is a multinomial regression model object that contains the results of fitting a nominal multinomial regression model to the data. The chi-squared statistic and p-value correspond to the null hypothesis that the fitted model does not outperform a degenerate model consisting of only an intercept term. The small p-value indicates that enough evidence exists to reject the null hypothesis and conclude that the fitted model outperforms the degenerate model.

Perform an F-test of the null hypothesis that all coefficients, except the intercept terms, are zero. Assess the result at the 5% significance level (95% confidence level).

`p = coefTest(mdl)`
```
p = 3.5512e-133
```

The small p-value in the output indicates that enough evidence exists to reject the null hypothesis that all coefficients, except the intercept terms, are zero. You can conclude that at least one of the fitted model coefficients is statistically significant at the 5% significance level (95% confidence level).

Load the `carsmall` data set.

`load carsmall`

The variables `Acceleration`, `Weight`, and `Model_Year` contain data for car acceleration, weight, and model year, respectively. The variable `MPG` contains car mileage data in miles per gallon (MPG).

Sort the data in `MPG` into four response categories by using the `discretize` function.

```
MPG = discretize(MPG,[9 19 29 39 48]);
tbl = table(MPG,Acceleration,Weight,Model_Year);
```

Fit a multinomial regression model of the car mileage as a function of the acceleration, weight, and model year.

`mdl = fitmnr(tbl,"MPG ~ Acceleration + Model_Year + Weight",CategoricalPredictors="Model_Year")`
```
mdl = 
Multinomial regression with nominal responses

                        Value         SE         tStat       pValue   
                       ________    _________    _______    ___________

    (Intercept_1)        160.06       15.697     10.197     2.0493e-24
    Acceleration_1      -11.683      0.53323    -21.909    2.1299e-106
    Weight_1            0.10169    0.0034745     29.267    2.7358e-188
    Model_Year_76_1      189.24       4.5868     41.257              0
    Model_Year_82_1     -1754.1       4.6231    -379.43              0
    (Intercept_2)        183.55       14.211     12.916     3.6655e-38
    Acceleration_2      -11.653      0.48884    -23.838    1.3423e-125
    Weight_2           0.093348    0.0030349     30.758    9.4624e-208
    Model_Year_76_2       194.1       4.2373     45.807              0
    Model_Year_82_2     -141.57       3.4781    -40.702              0
    (Intercept_3)        105.99       14.991     7.0701     1.5482e-12
    Acceleration_3      -11.731      0.48805    -24.037    1.1292e-127
    Weight_3            0.08341    0.0033652     24.786    1.2743e-135
    Model_Year_76_3      293.57       4.7309     62.054              0
    Model_Year_82_3     -36.451       4.0878    -8.9169     4.7948e-19

94 observations, 267 error degrees of freedom
Dispersion: 1
Chi^2-statistic vs. constant model: 169.6193, p-value = 5.7114e-30
```

`mdl` is a multinomial regression model object that contains the results of fitting a nominal multinomial regression model to the data. By default, the fourth response category is the reference category. Each row of the table output corresponds to the coefficient of the model term in the first column. The `tStat` and `pValue` columns contain the t-statistics and p-values, respectively, for the null hypothesis that the corresponding coefficient is zero. The small p-values for the `Model_Year` terms indicate that the model year has a statistically significant effect on the response `MPG`. For example, the p-value for the term `Model_Year_76_2` indicates that a car being manufactured in 1976 has a statistically significant effect on $\ln\left(\frac{\pi_{2}}{\pi_{4}}\right)$, where $\pi_{i}$ is the $i$th category probability.

You can use a numeric index matrix to investigate whether a group of coefficients contains a coefficient that is statistically significant. Use a numeric index matrix to test the null hypothesis that all coefficients corresponding to the `Model_Year` terms are zero.

```
idx_Model_Year = [0 0 0 1 0 0 0 0 0 0 0 0 0 0 0;...
                  0 0 0 0 1 0 0 0 0 0 0 0 0 0 0;...
                  0 0 0 0 0 0 0 0 1 0 0 0 0 0 0;...
                  0 0 0 0 0 0 0 0 0 1 0 0 0 0 0;...
                  0 0 0 0 0 0 0 0 0 0 0 0 0 1 0;...
                  0 0 0 0 0 0 0 0 0 0 0 0 0 0 1];
[p_Model_Year,F_Model_Year,r_Model_Year] = coefTest(mdl,idx_Model_Year)
```

```
p_Model_Year = 0
```

```
F_Model_Year = 5.2752e+04
```

```
r_Model_Year = 6
```

The returned p-value indicates that at least one of the category coefficients corresponding to `Model_Year` is statistically different from zero. This result is consistent with the small p-value for each of the `Model_Year` coefficients.
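Typing a 6-by-15 index matrix by hand is error-prone, so the same hypothesis matrix can be sketched programmatically. The block below is a minimal sketch, not the documented workflow: it assumes that the row names of `mdl.Coefficients` match the coefficient names in the model display, in the same order. The variable names `coefNames`, `isModelYear`, and `H` are illustrative.

```matlab
% Rebuild the carsmall model from the example so this block runs on its own.
load carsmall
MPG = discretize(MPG,[9 19 29 39 48]);
tbl = table(MPG,Acceleration,Weight,Model_Year);
mdl = fitmnr(tbl,"MPG ~ Acceleration + Model_Year + Weight", ...
    CategoricalPredictors="Model_Year");

% Assumption: the row names of mdl.Coefficients are the coefficient names
% shown in the model display, in the same order.
coefNames = string(mdl.Coefficients.Properties.RowNames);

% Logical mask of the coefficients that belong to Model_Year terms.
isModelYear = contains(coefNames,"Model_Year");

% Keep one identity-matrix row per selected coefficient, which gives one
% row of H per coefficient under test.
I = eye(numel(coefNames));
H = I(isModelYear,:);

[p,F,r] = coefTest(mdl,H)
```

If the assumption about the row names holds, `H` has one row for each `Model_Year` coefficient and one column for each model coefficient, matching the hand-written `idx_Model_Year`.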

## Input Arguments


`mdl`: Multinomial regression model object, specified as a `MultinomialRegression` model object created with the `fitmnr` function.

`H`: Hypothesis matrix, specified as a full-rank numeric index matrix of size r-by-s, where r is the number of linear combinations of coefficients being tested, and s is the total number of coefficients.

• If you specify `H`, then the output `p` is the p-value for an F-test that H × B = 0, where B represents the coefficient vector.

• If you specify `H` and `C`, then the output `p` is the p-value for an F-test that H × B = C.

Example: `[1 0 0 0 0]` tests the first coefficient among five coefficients.

Data Types: `single` | `double` | `logical`

`C`: Hypothesized value for testing the null hypothesis, specified as a numeric vector with the same number of rows as `H`.

If you specify `H` and `C`, then the output `p` is the p-value for an F-test that H × B = C, where B represents the coefficient vector.

Data Types: `single` | `double`
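The following sketch illustrates the three-argument syntax by testing whether a single coefficient equals a nonzero value. It reuses the `fisheriris` model from the first example. The hypothesized value `C = -5` is a hypothetical choice made only for illustration, and the position of `PetalLength_versicolor` is read from the order of the coefficients in the model display.

```matlab
% Refit the fisheriris model from the first example.
load fisheriris
tbl = array2table(meas, ...
    VariableNames=["SepalLength","SepalWidth","PetalLength","PetalWidth"]);
tbl.Species = species;
mdl = fitmnr(tbl,"Species ~ PetalLength + PetalWidth^2");

% The model display lists eight coefficients, and PetalLength_versicolor
% is the sixth, so H selects that coefficient only.
H = [0 0 0 0 0 1 0 0];

% Hypothetical hypothesized value, chosen only for illustration.
C = -5;

% F-test of the null hypothesis H*B = C.
[p,F,r] = coefTest(mdl,H,C)
```

Because `H` has a single row, `C` is a scalar and the test has one numerator degree of freedom.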

## Output Arguments


`p`: p-value for the F-test, returned as a numeric value in the range [0,1].

`F`: Value of the test statistic for the F-test, returned as a numeric value.

`r`: Numerator degrees of freedom for the F-test, returned as a positive integer. The F-statistic has `r` degrees of freedom in the numerator and `mdl.DFE` degrees of freedom in the denominator.
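Given the degrees of freedom stated above, the p-value should correspond to the upper-tail probability of an F distribution evaluated at `F`. The sketch below checks this relationship with the `fcdf` function. It is an illustration under that assumption, not a description of how `coefTest` computes `p` internally, and `pCheck` is an illustrative variable name.

```matlab
% Refit the fisheriris model from the first example.
load fisheriris
tbl = array2table(meas, ...
    VariableNames=["SepalLength","SepalWidth","PetalLength","PetalWidth"]);
tbl.Species = species;
mdl = fitmnr(tbl,"Species ~ PetalLength + PetalWidth^2");

% Return the p-value, F-statistic, and numerator degrees of freedom.
[p,F,r] = coefTest(mdl);

% Upper-tail probability of an F distribution with r numerator and
% mdl.DFE denominator degrees of freedom, evaluated at F.
pCheck = fcdf(F,r,mdl.DFE,"upper")
```

Comparing `pCheck` with `p` is a quick way to confirm which degrees of freedom a reported F-statistic uses.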

## Version History

Introduced in R2023a