predict

Predict responses using regression tree

Description

Yfit = predict(Mdl,X) returns a vector of predicted responses for the predictor data in the table or matrix X, based on the full or compact regression tree Mdl.

Yfit = predict(Mdl,X,Subtrees=subtrees) also prunes Mdl to the level specified by subtrees, before predicting responses.

Before R2021a, use the equivalent syntax Yfit = predict(Mdl,X,"Subtrees",subtrees).
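As a minimal sketch of the Subtrees syntax, assuming the carsmall data used in the example on this page, the following code compares predictions from the unpruned tree with predictions from the tree pruned to its highest level. The variable names (X0, fullFit, rootFit, maxLevel) and the two sample cars are illustrative only.

```matlab
load carsmall
X = [Displacement Horsepower Weight];
Mdl = fitrtree(X,MPG);              % fitrtree computes a pruning sequence by default

X0 = [200 150 3000; 100 90 2200];   % two hypothetical cars (assumed values)

% Level 0 is the full, unpruned tree.
fullFit = predict(Mdl,X0,Subtrees=0);

% The highest pruning level leaves only the root node.
maxLevel = max(Mdl.PruneList);
rootFit = predict(Mdl,X0,Subtrees=maxLevel);

% Before R2021a, write the same call as:
% rootFit = predict(Mdl,X0,"Subtrees",maxLevel);
```

Because the completely pruned tree consists of the root node alone, every row of rootFit should hold the same value, whereas fullFit can differ from row to row.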

[Yfit,node] = predict(___) also returns a vector of predicted node numbers for the responses, using any of the input arguments in the previous syntaxes.

Examples

Load the carsmall data set. Consider Displacement, Horsepower, and Weight as predictors of the response MPG.

load carsmall
X = [Displacement Horsepower Weight];

Grow a regression tree using the entire data set.

Mdl = fitrtree(X,MPG);

Predict the MPG for a car with a 200-cubic-inch engine displacement, 150 horsepower, and a weight of 3000 lbs.

X0 = [200 150 3000];
MPG0 = predict(Mdl,X0)
MPG0 = 21.9375

The regression tree predicts the car's efficiency to be 21.94 mpg.

Input Arguments

Mdl: Trained regression tree, specified as a RegressionTree object created by the fitrtree function or a CompactRegressionTree object created by the compact function.

X: Predictor data used to generate responses, specified as a numeric matrix or table.

Each row of X corresponds to one observation, and each column corresponds to one variable.

  • For a numeric matrix:

    • The variables making up the columns of X must have the same order as the predictor variables that trained Mdl.

    • If you trained Mdl using a table (for example, Tbl), then X can be a numeric matrix if Tbl contains all numeric predictor variables. To treat numeric predictors in Tbl as categorical during training, identify categorical predictors using the CategoricalPredictors name-value pair argument of fitrtree. If Tbl contains heterogeneous predictor variables (for example, numeric and categorical data types) and X is a numeric matrix, then predict throws an error.

  • For a table:

    • predict does not support multicolumn variables or cell arrays other than cell arrays of character vectors.

    • If you trained Mdl using a table (for example, Tbl), then all predictor variables in X must have the same variable names and data types as those that trained Mdl (stored in Mdl.PredictorNames). However, the column order of X does not need to correspond to the column order of Tbl. Tbl and X can contain additional variables (response variables, observation weights, etc.), but predict ignores them.

    • If you trained Mdl using a numeric matrix, then the predictor names in Mdl.PredictorNames and corresponding predictor variable names in X must be the same. To specify predictor names during training, see the PredictorNames name-value pair argument of fitrtree. All predictor variables in X must be numeric vectors. X can contain additional variables (response variables, observation weights, etc.), but predict ignores them.

Data Types: table | double | single
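The table rules above can be sketched as follows, assuming the carsmall data from the example on this page. The table names (Tordered, Tshuffled) and the sample values are hypothetical. The second table reorders the predictor columns and adds an extra variable (Model_Year) that was not used in training.

```matlab
load carsmall
Tbl = table(Displacement,Horsepower,Weight,MPG);
Mdl = fitrtree(Tbl,'MPG');          % train on a table; MPG is the response

% New observation with predictors in the training order.
Tordered = table(200,150,3000, ...
    'VariableNames',{'Displacement','Horsepower','Weight'});

% Same observation, different column order, plus an extra variable.
% predict matches predictors by name and ignores Model_Year.
Tshuffled = table(3000,76,200,150, ...
    'VariableNames',{'Weight','Model_Year','Displacement','Horsepower'});

yOrdered  = predict(Mdl,Tordered);
yShuffled = predict(Mdl,Tshuffled);
sameFit   = isequal(yOrdered,yShuffled);
```

With a numeric matrix, by contrast, the columns must appear in the same order as the predictors that trained Mdl, because there are no variable names to match.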

Subtrees: Pruning level, specified as a vector of nonnegative integers in ascending order or "all".

If you specify a vector, then all elements must be at least 0 and at most max(Mdl.PruneList). 0 indicates the full, unpruned tree and max(Mdl.PruneList) indicates the completely pruned tree (in other words, just the root node).

If you specify "all", then predict operates on all subtrees (in other words, the entire pruning sequence). This specification is equivalent to using 0:max(Mdl.PruneList).

predict prunes Mdl to each level indicated in Subtrees, and then estimates the corresponding output arguments. The size of Subtrees determines the size of some output arguments.

To use Subtrees, the PruneList and PruneAlpha properties of Mdl must be nonempty. In other words, either grow Mdl by setting Prune="on" or prune Mdl by using prune.

Data Types: single | double | char | string
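The following sketch, again assuming the carsmall data, checks the pruning properties before calling predict with Subtrees="all" and compares the result with the explicit level vector 0:max(Mdl.PruneList). It also shows one way to prepare a tree that was grown with Prune="off". All variable names are illustrative.

```matlab
load carsmall
X = [Displacement Horsepower Weight];
X0 = [200 150 3000];

Mdl = fitrtree(X,MPG);

% Subtrees requires a pruning sequence stored in the model.
hasPruneInfo = ~isempty(Mdl.PruneList) && ~isempty(Mdl.PruneAlpha);

% "all" covers the entire pruning sequence.
fitAll   = predict(Mdl,X0,Subtrees="all");
fitRange = predict(Mdl,X0,Subtrees=0:max(Mdl.PruneList));
sameFit  = isequaln(fitAll,fitRange);

% A tree grown without a pruning sequence needs a call to prune first.
MdlOff = fitrtree(X,MPG,Prune="off");
MdlOff = prune(MdlOff);             % fills in the pruning sequence
fitOff = predict(MdlOff,X0,Subtrees="all");
```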

Output Arguments

Yfit: Predicted response values, returned as a numeric column vector with the same number of rows as X. Each row of Yfit gives the predicted response to the corresponding row of X, based on Mdl.

node: Node numbers for the predictions, returned as a numeric vector. Each entry corresponds to the predicted leaf node in Mdl for the corresponding row of X.
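To illustrate how the two outputs relate, the following sketch (assuming the carsmall data and the default response transformation) looks up the returned node in the IsBranchNode and NodeMean properties of the tree. The variable names isLeaf and nodeMean are illustrative.

```matlab
load carsmall
X = [Displacement Horsepower Weight];
Mdl = fitrtree(X,MPG);

X0 = [200 150 3000];                % observation with no missing values
[Yfit,node] = predict(Mdl,X0);

% The returned node is a leaf, so it is not a branch node.
isLeaf = ~Mdl.IsBranchNode(node);

% With no response transformation, the prediction is the mean
% training response stored for that node.
nodeMean = Mdl.NodeMean(node);
```

In this setting, nodeMean and Yfit should agree, which makes node a convenient index into other per-node properties of Mdl.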

Alternative Functionality

Simulink Block

To integrate the prediction of a regression tree model into Simulink®, you can use the RegressionTree Predict block in the Statistics and Machine Learning Toolbox™ library or a MATLAB® Function block with the predict function. For examples, see Predict Responses Using RegressionTree Predict Block and Predict Class Labels Using MATLAB Function Block.

When deciding which approach to use, consider the following:

  • If you use the Statistics and Machine Learning Toolbox library block, you can use the Fixed-Point Tool (Fixed-Point Designer) to convert a floating-point model to fixed point.

  • If you use a MATLAB Function block with the predict function, you must enable support for variable-size arrays.

  • If you use a MATLAB Function block, you can use MATLAB functions for preprocessing or post-processing before or after predictions in the same MATLAB Function block.
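For the MATLAB Function block approach, a minimal sketch of the function body might look like the following. It assumes that the trained tree was saved beforehand with saveLearnerForCoder. The function name predictMPG and the file name mpgTree are hypothetical, chosen only for this illustration.

```matlab
function Yfit = predictMPG(X) %#codegen
% predictMPG  Hypothetical function body for a MATLAB Function block.
%
% Run once at the command line before simulating, so that the
% model file exists (mpgTree is an assumed file name):
%   saveLearnerForCoder(Mdl,'mpgTree');

% Optional preprocessing of X can go here.

Mdl = loadLearnerForCoder('mpgTree');   % reconstruct the saved tree
Yfit = predict(Mdl,X);                  % one response per row of X

% Optional post-processing of Yfit can go here.
end
```

Keeping preprocessing, prediction, and post-processing in one function is the main reason to prefer this approach over the library block.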

Version History

Introduced in R2011a