Predict labels using classification tree

`label = predict(Mdl,X)`

`label = predict(Mdl,X,Name,Value)`

```
[label,score,node,cnum] = predict(___)
```

`label = predict(Mdl,X,Name,Value)` uses additional options specified by one or more `Name,Value` pair arguments. For example, you can specify to prune `Mdl` to a particular level before predicting labels.

`[label,score,node,cnum] = predict(___)` uses any of the input arguments in the previous syntaxes and additionally returns:

- A matrix of classification scores (`score`) indicating the likelihood that a label comes from a particular class. For classification trees, scores are posterior probabilities. For each observation in `X`, the predicted class label corresponds to the minimum expected misclassification cost among all classes.
- A vector of predicted node numbers for the classification (`node`).
- A vector of predicted class numbers for the classification (`cnum`).

`Mdl`: Trained classification tree
`ClassificationTree` model object | `CompactClassificationTree` model object

Trained classification tree, specified as a `ClassificationTree` or `CompactClassificationTree` model object. That is, `Mdl` is a trained classification model returned by `fitctree` or `compact`.

`X`: Predictor data to be classified
numeric matrix | table

Predictor data to be classified, specified as a numeric matrix or table.

Each row of `X` corresponds to one observation, and each column corresponds to one variable.

For a numeric matrix:

- The variables making up the columns of `X` must have the same order as the predictor variables that trained `Mdl`.
- If you trained `Mdl` using a table (for example, `Tbl`), then `X` can be a numeric matrix if `Tbl` contains all numeric predictor variables. To treat numeric predictors in `Tbl` as categorical during training, identify categorical predictors using the `CategoricalPredictors` name-value pair argument of `fitctree`. If `Tbl` contains heterogeneous predictor variables (for example, numeric and categorical data types) and `X` is a numeric matrix, then `predict` throws an error.

For a table:

- `predict` does not support multi-column variables and cell arrays other than cell arrays of character vectors.
- If you trained `Mdl` using a table (for example, `Tbl`), then all predictor variables in `X` must have the same variable names and data types as those that trained `Mdl` (stored in `Mdl.PredictorNames`). However, the column order of `X` does not need to correspond to the column order of `Tbl`. `Tbl` and `X` can contain additional variables (response variables, observation weights, etc.), but `predict` ignores them.
- If you trained `Mdl` using a numeric matrix, then the predictor names in `Mdl.PredictorNames` and corresponding predictor variable names in `X` must be the same. To specify predictor names during training, see the `PredictorNames` name-value pair argument of `fitctree`. All predictor variables in `X` must be numeric vectors. `X` can contain additional variables (response variables, observation weights, etc.), but `predict` ignores them.

**Data Types:** `table` | `double` | `single`
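As a rough illustration of the table rules above, the following sketch trains on a table and then predicts from a table whose columns are reordered and which omits the response. The variable names `SL`, `SW`, `PL`, and `PW` are made up for this example and are not part of the data set.

```matlab
load fisheriris

% Hypothetical variable names chosen for this sketch
names = {'SL','SW','PL','PW'};
Tbl = array2table(meas,'VariableNames',names);
Tbl.Species = species;

% Train on the table; 'Species' is the response variable
Mdl = fitctree(Tbl,'Species');

% Same predictor names and data types, different column order, no response
Xnew = Tbl(1:5,{'PW','PL','SW','SL'});
labelTbl = predict(Mdl,Xnew);

% All predictors in Tbl are numeric, so a numeric matrix in the
% training column order should also be accepted
labelMat = predict(Mdl,meas(1:5,:));

isequal(labelTbl,labelMat)
```

Because `predict` matches table variables by name, both calls should describe the same five observations and return the same labels.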

Specify optional comma-separated pairs of `Name,Value` arguments. `Name` is the argument name and `Value` is the corresponding value. `Name` must appear inside quotes. You can specify several name and value pair arguments in any order as `Name1,Value1,...,NameN,ValueN`.

`'Subtrees'`: Pruning level
`0` (default) | vector of nonnegative integers | `'all'`

Pruning level, specified as the comma-separated pair consisting of `'Subtrees'` and a vector of nonnegative integers in ascending order or `'all'`.

If you specify a vector, then all elements must be at least `0` and at most `max(Mdl.PruneList)`. `0` indicates the full, unpruned tree and `max(Mdl.PruneList)` indicates the completely pruned tree (i.e., just the root node).

If you specify `'all'`, then `predict` operates on all subtrees (i.e., the entire pruning sequence). This specification is equivalent to using `0:max(Mdl.PruneList)`.

`predict` prunes `Mdl` to each level indicated in `Subtrees`, and then estimates the corresponding output arguments. The size of `Subtrees` determines the size of some output arguments.

To invoke `Subtrees`, the properties `PruneList` and `PruneAlpha` of `Mdl` must be nonempty. In other words, grow `Mdl` by setting `'Prune','on'`, or by pruning `Mdl` using `prune`.

**Example:** `'Subtrees','all'`

**Data Types:** `single` | `double` | `char` | `string`
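The equivalence between `'all'` and `0:max(Mdl.PruneList)` can be checked with a short sketch like the one below. It assumes that `fitctree` computes the pruning sequence when growing the tree, so that `Mdl.PruneList` is nonempty.

```matlab
load fisheriris
Mdl = fitctree(meas,species);   % assumes the pruning sequence is computed here

% Every pruning level, from the full tree (0) to the root-only tree
levels = 0:max(Mdl.PruneList);

labelAll = predict(Mdl,meas,'Subtrees','all');
labelVec = predict(Mdl,meas,'Subtrees',levels);

isequal(labelAll,labelVec)   % both calls cover the same subtrees
size(labelAll)               % one column per pruning level
```

The last column corresponds to the completely pruned tree, so every observation in that column receives the same label.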

`label`: Predicted class labels
vector | array

Predicted class labels, returned as a vector or array. Each entry of `label` corresponds to the class with minimal expected cost for the corresponding row of `X`.

Suppose `Subtrees` is a numeric vector containing `T` elements (for `'all'`, see `Subtrees`), and `X` has `N` rows.

- If the response data type is `char` and:
  - `T` = 1, then `label` is a character matrix containing `N` rows. Each row contains the predicted label produced by subtree `Subtrees`.
  - `T` > 1, then `label` is an `N`-by-`T` cell array.
- Otherwise, `label` is an `N`-by-`T` array having the same data type as the response. (The software treats string arrays as cell arrays of character vectors.)

In the latter two cases, column *j* of `label` contains the vector of predicted labels produced by subtree `Subtrees(j)`.

`score`: Posterior probabilities
numeric matrix

Posterior probabilities, returned as a numeric matrix of size `N`-by-`K`, where `N` is the number of observations (rows) in `X`, and `K` is the number of classes (in `Mdl.ClassNames`). `score(i,j)` is the posterior probability that row `i` of `X` is of class `j`.

If `Subtrees` has `T` elements, and `X` has `N` rows, then `score` is an `N`-by-`K`-by-`T` array, and `node` and `cnum` are `N`-by-`T` matrices.

`cnum`: Class numbers
numeric vector

Class numbers corresponding to the predicted labels in `label`, returned as a numeric vector. Each entry of `cnum` corresponds to a predicted class number for the corresponding row of `X`.
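The following sketch requests all four outputs and inspects how they relate to each other. It assumes that the class numbers in `cnum` index into `Mdl.ClassNames`, which is how the sketch interprets "class number" here.

```matlab
load fisheriris
Mdl = fitctree(meas,species);

[label,score,node,cnum] = predict(Mdl,meas);

N = size(meas,1);             % number of observations
K = numel(Mdl.ClassNames);    % number of classes

size(score)                           % N-by-K
max(abs(sum(score,2) - 1))            % each row of score sums to about 1
isequal(Mdl.ClassNames(cnum),label)   % assumption: cnum indexes ClassNames
```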

Examine predictions for a few rows in a data set left out of training.

Load Fisher's iris data set.

`load fisheriris`

Partition the data into training (50%) and validation (50%) sets.

```
n = size(meas,1);
rng(1) % For reproducibility
idxTrn = false(n,1);
idxTrn(randsample(n,round(0.5*n))) = true; % Training set logical indices
idxVal = idxTrn == false;                  % Validation set logical indices
```

Grow a classification tree using the training set.

`Mdl = fitctree(meas(idxTrn,:),species(idxTrn));`

Predict labels for the validation data. Count the number of misclassified observations.

```
label = predict(Mdl,meas(idxVal,:));
label(randsample(numel(label),5)) % Display several predicted labels
```

`ans = `*5x1 cell array*
{'setosa' }
{'setosa' }
{'setosa' }
{'virginica' }
{'versicolor'}

`numMisclass = sum(~strcmp(label,species(idxVal)))`

numMisclass = 3

The software misclassifies three out-of-sample observations.

Load Fisher's iris data set.

`load fisheriris`

Partition the data into training (50%) and validation (50%) sets.

```
n = size(meas,1);
rng(1) % For reproducibility
idxTrn = false(n,1);
idxTrn(randsample(n,round(0.5*n))) = true; % Training set logical indices
idxVal = idxTrn == false;                  % Validation set logical indices
```

Grow a classification tree using the training set, and then view it.

```
Mdl = fitctree(meas(idxTrn,:),species(idxTrn));
view(Mdl,'Mode','graph')
```

The resulting tree has four levels.

Estimate posterior probabilities for the validation set using subtrees pruned to levels 1 and 3.

```
[~,Posterior] = predict(Mdl,meas(idxVal,:),'Subtrees',[1 3]);
Mdl.ClassNames
```

`ans = `*3x1 cell array*
{'setosa' }
{'versicolor'}
{'virginica' }

```
Posterior(randsample(size(Posterior,1),5),:,:),...
    % Display several posterior probabilities
```

```
ans = 
ans(:,:,1) =

    1.0000         0         0
    1.0000         0         0
    1.0000         0         0
         0         0    1.0000
         0    0.8571    0.1429


ans(:,:,2) =

    0.3733    0.3200    0.3067
    0.3733    0.3200    0.3067
    0.3733    0.3200    0.3067
    0.3733    0.3200    0.3067
    0.3733    0.3200    0.3067
```

The elements of `Posterior` are class posterior probabilities:

- Rows correspond to observations in the validation set.
- Columns correspond to the classes as listed in `Mdl.ClassNames`.
- Pages correspond to the subtrees.

The subtree pruned to level 1 is more sure of its predictions than the subtree pruned to level 3 (i.e., the root node).

`predict` classifies by minimizing the expected classification cost:

$$\widehat{y}=\underset{y=1,\ldots,K}{\arg\min}\sum_{j=1}^{K}\widehat{P}\left(j|x\right)C\left(y|j\right),$$

where

- $$\widehat{y}$$ is the predicted classification.
- *K* is the number of classes.
- $$\widehat{P}\left(j|x\right)$$ is the posterior probability of class *j* for observation *x*.
- $$C\left(y|j\right)$$ is the cost of classifying an observation as *y* when its true class is *j*.
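As a short worked case, suppose the cost is 0 for a correct classification and 1 for an incorrect one, so that $$C\left(y|j\right)=0$$ when $$y=j$$ and $$C\left(y|j\right)=1$$ otherwise. Because the posterior probabilities sum to 1 over the classes, the expected cost of predicting *y* reduces to

$$\sum_{j=1}^{K}\widehat{P}\left(j|x\right)C\left(y|j\right)=\sum_{j\ne y}\widehat{P}\left(j|x\right)=1-\widehat{P}\left(y|x\right).$$

Minimizing this quantity over *y* is the same as maximizing the posterior probability:

$$\widehat{y}=\underset{y=1,\ldots,K}{\arg\max}\,\widehat{P}\left(y|x\right).$$

Under this cost, then, the predicted label is simply the class with the largest score. With any other cost matrix the two rules can disagree.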

For trees, the *score* of a classification
of a leaf node is the posterior probability of the classification
at that node. The posterior probability of the classification at a
node is the number of training sequences that lead to that node with
the classification, divided by the number of training sequences that
lead to that node.

For example, consider classifying a predictor `X` as `true` when `X` < `0.15` or `X` > `0.95`, and as `false` otherwise.

Generate 100 random points and classify them:

```
rng(0,'twister') % for reproducibility
X = rand(100,1);
Y = (abs(X - .55) > .4);
tree = fitctree(X,Y);
view(tree,'Mode','Graph')
```

Prune the tree:

```
tree1 = prune(tree,'Level',1);
view(tree1,'Mode','Graph')
```

The pruned tree correctly classifies observations that are less than 0.15 as `true`. It also correctly classifies observations from 0.15 to 0.95 as `false`. However, it incorrectly classifies observations that are greater than 0.95 as `false`. Therefore, the score for observations that are greater than 0.15 should be about 0.05/0.85 = 0.06 for `true`, and about 0.8/0.85 = 0.94 for `false`.

Compute the prediction scores for the first 10 rows of `X`:

```
[~,score] = predict(tree1,X(1:10));
[score X(1:10,:)]
```

`ans = `*10×3*
0.9059 0.0941 0.8147
0.9059 0.0941 0.9058
0 1.0000 0.1270
0.9059 0.0941 0.9134
0.9059 0.0941 0.6324
0 1.0000 0.0975
0.9059 0.0941 0.2785
0.9059 0.0941 0.5469
0.9059 0.0941 0.9575
0.9059 0.0941 0.9649

Indeed, every value of `X` (the right-most column) that is less than 0.15 has associated scores (the left and center columns) of `0` and `1`, while the other values of `X` have associated scores of `0.91` and `0.09`. The difference (score `0.09` instead of the expected `.06`) is due to a statistical fluctuation: there are `8` observations in `X` in the range `(.95,1)` instead of the expected `5` observations.
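The description of a score as a ratio of training counts can be checked against the same data. The sketch below regenerates the data with the same seed as the example above and uses the `node` output to group training observations by the leaf they reach. It assumes the default settings of `fitctree` (no observation weights or custom priors), under which the posterior should equal the plain class fraction in each leaf.

```matlab
rng(0,'twister') % same seed as in the example above
X = rand(100,1);
Y = (abs(X - .55) > .4);
tree1 = prune(fitctree(X,Y),'Level',1);

% Leaf node reached by each training observation
[~,score,node] = predict(tree1,X);

% Column 2 of score belongs to the class true (class order is false, true).
% Compare it with the fraction of true responses in the same leaf.
leaves = unique(node)';
maxDiff = 0;
for m = leaves
    inLeaf = (node == m);
    fracTrue = mean(Y(inLeaf));
    maxDiff = max(maxDiff,max(abs(score(inLeaf,2) - fracTrue)));
end
maxDiff   % should be close to 0 under the stated assumptions
```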

There are two costs associated with classification: the true misclassification cost per class, and the expected misclassification cost per observation.

You can set the true misclassification cost per class in the `Cost` name-value pair when you create the classifier using the `fitctree` method. `Cost(i,j)` is the cost of classifying an observation into class `j` if its true class is `i`. By default, `Cost(i,j)=1` if `i~=j`, and `Cost(i,j)=0` if `i=j`. In other words, the cost is `0` for correct classification, and `1` for incorrect classification.
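A minimal sketch of setting a nondefault cost follows. The penalty value 10 is arbitrary and chosen only for illustration. The sketch passes `'ClassNames'` explicitly so that the row and column order of `Cost` is unambiguous, and it assumes that the trained model exposes the matrix through its `Cost` property.

```matlab
load fisheriris

% Rows are true classes and columns are predicted classes,
% in the order given by classNames
classNames = {'setosa';'versicolor';'virginica'};
Cost = ones(3) - eye(3);   % default pattern: 0 on the diagonal, 1 elsewhere
Cost(3,2) = 10;            % true virginica classified as versicolor (illustrative)

MdlDefault = fitctree(meas,species);
MdlCost = fitctree(meas,species,'Cost',Cost,'ClassNames',classNames);

MdlDefault.Cost   % default cost matrix
MdlCost.Cost      % cost matrix used by the second model
```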

Suppose you have `Nobs` observations that you want to classify with a trained classifier. Suppose you have `K` classes. You place the observations into a matrix `Xnew` with one observation per row.

The expected cost matrix `CE` has size `Nobs`-by-`K`. Each row of `CE` contains the expected (average) cost of classifying the observation into each of the `K` classes. `CE(n,k)` is

$$\sum_{i=1}^{K}\widehat{P}\left(i|Xnew(n)\right)C\left(k|i\right),$$

where

- *K* is the number of classes.
- $$\widehat{P}\left(i|Xnew(n)\right)$$ is the posterior probability of class *i* for observation *Xnew*(*n*).
- $$C\left(k|i\right)$$ is the true misclassification cost of classifying an observation as *k* when its true class is *i*.
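The expected cost can be computed by hand as a matrix product. The numbers below are hypothetical posterior probabilities and costs, not the output of any trained model. The sketch takes $$C\left(k|i\right)$$ to be `Cost(i,k)`, following the `Cost(i,j)` convention described earlier.

```matlab
% Hypothetical posteriors for Nobs = 2 observations and K = 2 classes
P = [0.6 0.4
     0.9 0.1];

% Cost(i,k): cost of classifying into class k when the true class is i
Cost = [0 1
        5 0];

% CE(n,k) = sum over i of P(n,i)*Cost(i,k)
CE = P*Cost;

% The predicted class is the column with the smallest expected cost
[~,cnum] = min(CE,[],2);
```

Here `CE` is `[2.0 0.6; 0.5 0.9]`, so the first observation is assigned to class 2 even though class 1 has the larger posterior probability. The high cost of missing a true class 2 outweighs the difference in probability.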

The *predictive measure of association* is
a value that indicates the similarity between decision rules that
split observations. Among all possible decision splits that are compared
to the optimal split (found by growing the tree), the best surrogate decision
split yields the maximum predictive measure of association.
The second-best surrogate split has the second-largest predictive
measure of association.

Suppose $$x_j$$ and $$x_k$$ are predictor variables *j* and *k*, respectively, and *j* ≠ *k*. At node *t*, the predictive measure of association between the optimal split $$x_j<u$$ and a surrogate split $$x_k<v$$ is

$${\lambda}_{jk}=\frac{\text{min}\left({P}_{L},{P}_{R}\right)-\left(1-{P}_{{L}_{j}{L}_{k}}-{P}_{{R}_{j}{R}_{k}}\right)}{\text{min}\left({P}_{L},{P}_{R}\right)}.$$

- $${P}_{L}$$ is the proportion of observations in node *t*, such that $$x_j<u$$. The subscript *L* stands for the left child of node *t*.
- $${P}_{R}$$ is the proportion of observations in node *t*, such that $$x_j\ge u$$. The subscript *R* stands for the right child of node *t*.
- $${P}_{{L}_{j}{L}_{k}}$$ is the proportion of observations at node *t*, such that $$x_j<u$$ and $$x_k<v$$.
- $${P}_{{R}_{j}{R}_{k}}$$ is the proportion of observations at node *t*, such that $$x_j\ge u$$ and $$x_k\ge v$$.
- Observations with missing values for $$x_j$$ or $$x_k$$ do not contribute to the proportion calculations.

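$${\lambda}_{jk}$$ is a value in $$(-\infty,1]$$. If $${\lambda}_{jk}>0$$, then $$x_k<v$$ is a worthwhile surrogate split for $$x_j<u$$.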
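The formula for the predictive measure of association can be evaluated directly on a small data set. The values below are hypothetical and contain no missing entries. The names `xj`, `xk`, `u`, and `v` stand for the two predictors and their split points from the definitions above.

```matlab
% Hypothetical predictor values for 10 observations at node t
xj = (1:10)';
xk = [1 2 3 4 7 6 8 9 10 11]';
u = 5.5;   % optimal split: xj < u
v = 5.5;   % candidate surrogate split: xk < v

PL  = mean(xj < u);                % proportion sent left by the optimal split
PR  = mean(xj >= u);               % proportion sent right by the optimal split
PLL = mean(xj < u  & xk < v);      % sent left by both splits
PRR = mean(xj >= u & xk >= v);     % sent right by both splits

lambda = (min(PL,PR) - (1 - PLL - PRR))/min(PL,PR)
```

The two splits disagree on one observation out of ten (the fifth), so `1 - PLL - PRR` is 0.1 and `lambda` is (0.5 - 0.1)/0.5 = 0.8.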

`predict` generates predictions by following the branches of `Mdl` until it reaches a leaf node or a missing value. If `predict` reaches a leaf node, it returns the classification of that node.

If `predict` reaches a node with a missing value for a predictor, its behavior depends on the setting of the `Surrogate` name-value pair when `fitctree` constructs `Mdl`.

- `Surrogate` = `'off'` (default): `predict` returns the label with the largest number of training samples that reach the node.
- `Surrogate` = `'on'`: `predict` uses the best surrogate split at the node. If all surrogate split variables with positive *predictive measure of association* are missing, `predict` returns the label with the largest number of training samples that reach the node. For a definition, see Predictive Measure of Association.
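To see the two behaviors side by side, the sketch below trains one tree with each `Surrogate` setting and predicts for a single observation whose petal measurements are missing. The choice of observation and of which columns to blank out is arbitrary. No particular output is claimed here, because the result depends on which predictors the trees split on.

```matlab
load fisheriris
MdlOff = fitctree(meas,species);                   % 'Surrogate' is 'off' by default
MdlOn  = fitctree(meas,species,'Surrogate','on');

% One observation with columns 3 and 4 (petal length and width) missing
x = meas(100,:);
x([3 4]) = NaN;

[labelOff,scoreOff,nodeOff] = predict(MdlOff,x)
[labelOn,scoreOn,nodeOn]    = predict(MdlOn,x)
```

Comparing `nodeOff` with `nodeOn` shows how far down the tree each model was able to route the observation.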

Calculate with arrays that have more rows than fit in memory.

This function fully supports tall arrays. For more information, see Tall Arrays (MATLAB).

Generate C and C++ code using MATLAB® Coder™.

Usage notes and limitations:

- Use `saveCompactModel`, `loadCompactModel`, and `codegen` to generate code for the `predict` function. Save a trained model by using `saveCompactModel`. Define an entry-point function that loads the saved model by using `loadCompactModel` and calls the `predict` function. Then use `codegen` to generate code for the entry-point function.
- This table contains notes about the arguments of `predict`. Arguments not included in this table are fully supported.

| Argument | Notes and Limitations |
| --- | --- |
| `Mdl` | For the usage notes and limitations of the model object, see Code Generation of the `CompactClassificationTree` object. |
| `X` | Must be a single-precision or double-precision matrix and can be variable-size. However, the number of columns in `X` must be `numel(Mdl.PredictorNames)`. Rows and columns must correspond to observations and predictors, respectively. |
| `label` | If the response data type is `char` and `codegen` cannot determine that the value of `Subtrees` is a scalar, then `label` is a cell array of character vectors. |
| Name-value pair arguments | Names in name-value pair arguments must be compile-time constants. For example, to allow user-defined pruning levels in the generated code, include `{coder.Constant('Subtrees'),coder.typeof(0,[1,n],[0,1])}` in the `-args` value of `codegen`, where `n` is `max(Mdl.PruneList)`. |

For more information, see Introduction to Code Generation.

`ClassificationTree` | `CompactClassificationTree` | `compact` | `edge` | `fitctree` | `loss` | `margin` | `prune`
