predict

Predict responses of multinomial regression model

Since R2023a

collapse all in page

Syntax

Ypred = predict(mdl,XNew)

[Ypred,probs] = predict(mdl,XNew)

[Ypred,probs,lower,upper] = predict(mdl,XNew)

[___] = predict(___,Name=Value)

Description

Ypred = predict(mdl,XNew) returns predicted class labels for the predictor data XNew and MultinomialRegression model object mdl.

example

[Ypred,probs] = predict(mdl,XNew) also returns probability estimates for the response categories.

[Ypred,probs,lower,upper] = predict(mdl,XNew) also returns lower and upper confidence interval bounds for the responses Ypred.

[___] = predict(___,Name=Value) specifies options using one or more name-value arguments in addition to any of the input argument combinations in previous syntaxes. For example, you can specify the type of probability for the probability estimates returned in probs.

Examples

collapse all

Predict Response Categories

Open Live Script

Load the fisheriris sample data set.

load fisheriris

The column vector species contains three iris flowers species: setosa, versicolor, and virginica. The matrix meas contains four types of measurements for the flower: the length and width of sepals and petals in centimeters.

Divide the species and measurement data into training and test data by using the cvpartition function. Get the indices of the training data rows by using the training function.

n = length(species);
partition = cvpartition(n,'Holdout',0.05);
idx_train = training(partition);

Create training data by using the indices of the training data rows to create a matrix of measurements and a vector of species labels.

meastrain = meas(idx_train,:);
speciestrain = species(idx_train,:);

Fit a multinomial regression model using the training data.

mdl = fitmnr(meastrain,speciestrain)

mdl = 
Multinomial regression with nominal responses

                               Value       SE       tStat        pValue  
                              _______    ______    ________    __________

    (Intercept_setosa)         86.297    12.541       6.881    5.9436e-12
    x1_setosa                 -1.0653    3.5795    -0.29761         0.766
    x2_setosa                  23.849    3.1238      7.6347    2.2637e-14
    x3_setosa                 -27.273    3.5009     -7.7903    6.6846e-15
    x4_setosa                 -59.644    7.0214     -8.4947    1.9852e-17
    (Intercept_versicolor)     42.637    5.2214      8.1659    3.1906e-16
    x1_versicolor              2.4652    1.1263      2.1887      0.028619
    x2_versicolor              6.6808     1.474      4.5325     5.829e-06
    x3_versicolor             -9.4292    1.2946     -7.2837     3.248e-13
    x4_versicolor             -18.286    2.0833     -8.7775     1.671e-18


143 observations, 276 error degrees of freedom
Dispersion: 1
Chi^2-statistic vs. constant model: 302.0378, p-value = 1.5168e-60

mdl is a multinomial regression model object that contains the results of fitting a nominal multinomial regression model to the data. The table output shows coefficient statistics for each predictor in meas. By default, fitmnr uses virginica as the reference category.

Get the indices of the test data rows by using the test function. Create test data by using the indices of the test data rows to create a matrix of measurements and a vector of species labels.

idx_test = test(partition);
meastest = meas(idx_test,:);
speciestest = species(idx_test,:);

Predict the iris species for the measurements in meastest.

speciespredict = predict(mdl,meastest)

speciespredict = 7×1 cell
    {'setosa'    }
    {'setosa'    }
    {'setosa'    }
    {'setosa'    }
    {'setosa'    }
    {'versicolor'}
    {'versicolor'}

Compare the predictions in speciespredict with the category names in speciestest.

speciestest

speciestest = 7×1 cell
    {'setosa'    }
    {'setosa'    }
    {'setosa'    }
    {'setosa'    }
    {'setosa'    }
    {'versicolor'}
    {'versicolor'}

The output shows that the model accurately predicts the iris species for the measurements in meastest.

Calculate Confidence Intervals for Cumulative Category Probabilities

Open Live Script

Load the carbig sample data set.

load carbig

The variables Acceleration and Displacement contain data for car acceleration and displacement, respectively. The variable Cylinders contains data for the number of cylinders in each car engine.

Create a table from the car data variables using the table function.

tbl = table(Acceleration,Displacement,Cylinders,VariableNames=["Acceleration","Displacement","Cylinders"])

tbl=406×3 table
    Acceleration    Displacement    Cylinders
    ____________    ____________    _________

          12            307             8    
        11.5            350             8    
          11            318             8    
          12            304             8    
        10.5            302             8    
          10            429             8    
           9            454             8    
         8.5            440             8    
          10            455             8    
         8.5            390             8    
        17.5            133             4    
        11.5            350             8    
          11            351             8    
        10.5            383             8    
          11            360             8    
          10            383             8    
      ⋮

The Cylinders data has an inherent ordering. Fit an ordinal multinomial regression model using Acceleration and Displacement as predictor variables and Cylinders as the response.

mdl = fitmnr(tbl,"Cylinders",ModelType="ordinal");

mdl is a multinomial regression model object that contains the results of fitting an ordinal multinomial regression model to the data.

Predict the response category, cumulative category probabilities, and 99% confidence interval bounds for a car with an acceleration of 16 and an engine displacement of 80.

[cylinderspredict,cumprobs,lower,upper] = predict(mdl,[16 80],Alpha=0.01,ProbabilityType="cumulative")

cylinderspredict = 
4

cumprobs = 1×4

    0.0792    1.0000    1.0000    1.0000

lower = 1×4

    0.0787    1.0000    1.0000    1.0000

upper = 1×4

    0.0798    1.0000    1.0000    1.0000

The output shows that the predicted response category is 4. The vector cumprobs shows the cumulative probabilities for each category in Cylinders. To view the category probabilities on which the prediction is based, calculate the category probabilities.

[~,catprobs] = predict(mdl,[16 80])

catprobs = 1×5

    0.0792    0.9208    0.0000    0.0000    0.0000

The second value in the vector catprobs has the highest probability. Display an ordered list of the categories in Cylinders.

mdl.ClassNames

The output shows that the second category corresponds to cars with four cylinders. Therefore, the category with the highest category probability is 4.

Input Arguments

collapse all

`mdl` — Multinomial regression model object
`MultinomialRegression` model object

Multinomial regression model object, specified as a MultinomialRegression model object created with the fitmnr function.

`XNew` — New predictor input values
table | n-by-p matrix

New predictor input values, specified as a table or an n-by-p matrix, where n is the number of observations to predict, and p is the number of predictor variables used to fit mdl.

If XNew is a table, it must contain all the names of the predictors used to fit mdl. You can find the predictor names in the mdl.PredictorNames property.
If XNew is a matrix, it must have the same number of columns as the number of estimated coefficients. You can find the number of estimated coefficients in the mdl.NumPredictors property. You can specify XNew as a matrix only when all names in mdl.PredictorNames refer to numeric predictors.

Example: predict(mdl,[6.2 3.4; 5.9 3.0]) evaluates the two-predictor model mdl at the points p1 = [6.2 3.4] and p2 = [5.9 3.0].

Data Types: single | double | table

Name-Value Arguments

collapse all

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Example: [Ypred,probs,lower,upper] = predict(model,X,Alpha=0.01,ProbabilityType="cumulative") specifies a 99% confidence level for the probability estimates and their type as cumulative.

`Alpha` — Significance level for probability estimates
`0.05` (default) | scalar value in the range (0,1)

Significance level for the probability estimates, specified as a scalar value in the range (0,1). The confidence level of the confidence intervals is 100(1 − α)%. The default value for Alpha is 0.05, which returns 95% confidence intervals for the estimates.

Example: Alpha=0.01

Data Types: single | double

`ProbabilityType` — Type of probability estimates
`"category"` (default) | `"cumulative"` | `"conditional"`

Type of probability estimates to return in probs, specified as one of the following options.

Option	Description
`"category"` (default)	Calculate a distinct probability for each response category.
`"cumulative"`	Calculate a cumulative probability for each response category.
`"conditional"`	Calculate a conditional probability for each response category.

Example: ProbabilityType="conditional"

Data Types: char | string

Output Arguments

collapse all

`Ypred` — Predicted response categories
categorical array | character array | logical vector | numeric vector | cell array of character vectors

Predicted response categories, returned as a categorical or character array, logical or numeric vector, or cell array of character vectors. Ypred has the same data type as mdl.ClassNames.

`probs` — Probability estimates
numeric matrix

Probability estimates for the response categories, returned as a numeric matrix. Each column of probs corresponds to the entry at the same index in mdl.ClassNames.

`upper` — Upper confidence interval bound
numeric matrix

Upper confidence interval bound for the probability estimates in probs, returned as a numeric matrix.

`lower` — Lower confidence interval bound
numeric matrix

Lower confidence interval bound for the probability estimates in probs, returned as a numeric matrix.

Alternative Functionality

feval returns the same predictions as predict. The feval function can take multiple input arguments, with one input for each predictor variable. Note that the feval function does not give confidence intervals on predictions.
random returns predictions with added noise.

Version History

Introduced in R2023a

predict

Syntax

Description

Examples

Predict Response Categories

Calculate Confidence Intervals for Cumulative Category Probabilities

Input Arguments

`mdl` — Multinomial regression model object
`MultinomialRegression` model object

`XNew` — New predictor input values
table | n-by-p matrix

Name-Value Arguments

`Alpha` — Significance level for probability estimates
`0.05` (default) | scalar value in the range (0,1)

`ProbabilityType` — Type of probability estimates
`"category"` (default) | `"cumulative"` | `"conditional"`

Output Arguments

`Ypred` — Predicted response categories
categorical array | character array | logical vector | numeric vector | cell array of character vectors

`probs` — Probability estimates
numeric matrix

`upper` — Upper confidence interval bound
numeric matrix

`lower` — Lower confidence interval bound
numeric matrix

Alternative Functionality

Version History

See Also

Topics

predict

Syntax

Description

Examples

Predict Response Categories

Calculate Confidence Intervals for Cumulative Category Probabilities

Input Arguments

mdl — Multinomial regression model object MultinomialRegression model object

XNew — New predictor input values table | n-by-p matrix

Name-Value Arguments

Alpha — Significance level for probability estimates 0.05 (default) | scalar value in the range (0,1)

ProbabilityType — Type of probability estimates "category" (default) | "cumulative" | "conditional"

Output Arguments

Ypred — Predicted response categories categorical array | character array | logical vector | numeric vector | cell array of character vectors

probs — Probability estimates numeric matrix

upper — Upper confidence interval bound numeric matrix

lower — Lower confidence interval bound numeric matrix

Alternative Functionality

Version History

See Also

Topics

`mdl` — Multinomial regression model object
`MultinomialRegression` model object

`XNew` — New predictor input values
table | n-by-p matrix

`Alpha` — Significance level for probability estimates
`0.05` (default) | scalar value in the range (0,1)

`ProbabilityType` — Type of probability estimates
`"category"` (default) | `"cumulative"` | `"conditional"`

`Ypred` — Predicted response categories
categorical array | character array | logical vector | numeric vector | cell array of character vectors

`probs` — Probability estimates
numeric matrix

`upper` — Upper confidence interval bound
numeric matrix

`lower` — Lower confidence interval bound
numeric matrix