oobPredict

Ensemble predictions for out-of-bag observations

Syntax

Y = oobPredict(B)
Y = oobPredict(B,Name,Value)
[Y,stdevs] = oobPredict(___)
[Y,scores] = oobPredict(___)
[Y,scores,stdevs] = oobPredict(___)

Description

Y = oobPredict(B) computes predicted responses using the trained bagger B for out-of-bag observations in the training data. The output has one prediction for each observation in the training data. The returned Y is a cell array of character vectors for classification and a numeric array for regression.
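
As a minimal sketch of this syntax, the following example trains a small classification ensemble and requests its out-of-bag predictions. The choice of the fisheriris sample data set, the 50 trees, and the variable name oobAccuracy are assumptions made for illustration only; the 'OOBPrediction','on' option is set so that the bagger stores the out-of-bag information that oobPredict needs.

```matlab
rng(1);                                  % fix the seed so the bootstrap samples are repeatable
load fisheriris                          % meas: 150-by-4 predictors, species: 150-by-1 cell array of labels

% Train a bagged classification ensemble and keep out-of-bag information.
B = TreeBagger(50, meas, species, 'OOBPrediction', 'on');

% One predicted label per training observation, as a cell array of character vectors.
Y = oobPredict(B);

% Fraction of training observations whose out-of-bag prediction matches the true label.
oobAccuracy = mean(strcmp(Y, species));
```

Because each prediction uses only trees that did not see the observation during training, oobAccuracy can serve as a rough estimate of generalization accuracy without a separate test set.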

Y = oobPredict(B,Name,Value) specifies additional options using one or both of the following name-value pair arguments:

  • 'Trees' — Array of tree indices to use for computation of responses. The default is 'all'.

  • 'TreeWeights' — Array of NTrees weights for weighting votes from the specified trees, where NTrees is the number of trees in the ensemble.
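
The two options above might be used as in the following sketch. The data set, the tree count, and the particular weight values are illustrative assumptions. The weight vector is passed here with one element per tree in the ensemble while 'Trees' is left at its default; combining a tree subset with 'TreeWeights' is not shown because the required vector length in that case is not stated above.

```matlab
rng(1);                                  % repeatable bootstrap samples
load fisheriris
B = TreeBagger(50, meas, species, 'OOBPrediction', 'on');

% Use only the first 10 trees when forming out-of-bag predictions.
YfirstTen = oobPredict(B, 'Trees', 1:10);

% Use all trees, but give later trees more influence (arbitrary example weights).
w = linspace(1, 2, numel(B.Trees));      % one weight per tree in the ensemble
Yweighted = oobPredict(B, 'TreeWeights', w);
```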

For regression, [Y,stdevs] = oobPredict(___) also returns the standard deviations of the computed responses over the ensemble of grown trees, using any of the input argument combinations in the previous syntaxes.
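
A minimal sketch of the regression syntax follows. The synthetic data, the response formula, and the tree count are assumptions chosen only so that the example is self-contained.

```matlab
rng(1);                                          % repeatable data and bootstrap samples
X = rand(200, 3);                                % 200 observations, 3 predictors
y = X(:,1) + 2*X(:,2) + 0.1*randn(200, 1);       % synthetic numeric response

% Train a bagged regression ensemble and keep out-of-bag information.
B = TreeBagger(50, X, y, 'Method', 'regression', 'OOBPrediction', 'on');

% Y holds the out-of-bag predicted responses; stdevs holds their spread across trees.
[Y, stdevs] = oobPredict(B);
```

A large value in stdevs indicates an observation on which the individual trees disagree.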

For classification, [Y,scores] = oobPredict(___) also returns scores for all classes. scores is a matrix with one row per observation and one column per class. For each out-of-bag observation and each class, the score generated by each tree is the probability of the observation originating from the class, computed as the fraction of observations of the class in a tree leaf. oobPredict averages these scores over all trees in the ensemble.

[Y,scores,stdevs] = oobPredict(___) also returns standard deviations of the computed scores for classification. stdevs is a matrix with one row per observation and one column per class, with standard deviations taken over the ensemble of the grown trees.
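
The classification outputs can be requested together, as in the sketch below. The data set and tree count are illustrative assumptions, and the comment about column order assumes that the columns follow the ClassNames property of the bagger.

```matlab
rng(1);                                  % repeatable bootstrap samples
load fisheriris
B = TreeBagger(50, meas, species, 'OOBPrediction', 'on');

% Y: predicted labels; scores: averaged class probabilities;
% stdevs: standard deviations of those scores across trees.
[Y, scores, stdevs] = oobPredict(B);

% Columns of scores and stdevs are assumed to follow B.ClassNames.
numClasses = numel(B.ClassNames);
```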

Algorithms

oobPredict predicts classes and responses in much the same way as predict.

  • In regression problems:

    • For each observation that is out of bag for at least one tree, oobPredict computes the weighted mean of the responses from the trees in which the observation is out of bag. For this computation, the 'TreeWeights' name-value pair argument specifies the weights.

    • For each observation that is in bag for all trees, the predicted response is the weighted mean of all of the training responses. For this computation, the W property of the TreeBagger model (that is, the observation weights) specifies the weights.

  • In classification problems:

    • For each observation that is out of bag for at least one tree, oobPredict computes the weighted mean of the class posterior probabilities over the trees in which the observation is out of bag. The predicted class is the class with the largest weighted mean. For this computation, the 'TreeWeights' name-value pair argument specifies the weights.

    • For each observation that is in bag for all trees, the predicted class is the weighted, most popular class over all training responses. For this computation, the W property of the TreeBagger model (that is, the observation weights) specifies the weights. If there are multiple most popular classes, oobPredict considers the one listed first in the ClassNames property of the TreeBagger model the most popular.
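
The regression case described above can be sketched by hand from the OOBIndices, Trees, and W properties of the bagger. This is an illustration of the description, not the internal implementation: the synthetic data, the equal tree weights, and the variable names (P, Wt, Ymanual, maxDiff) are assumptions. If the description holds, maxDiff should be close to zero, but that outcome is not guaranteed here.

```matlab
rng(1);                                          % repeatable data and bootstrap samples
X = rand(200, 3);
y = X(:,1) + 2*X(:,2) + 0.1*randn(200, 1);       % synthetic numeric response
B = TreeBagger(50, X, y, 'Method', 'regression', 'OOBPrediction', 'on');

nTrees = numel(B.Trees);
w = ones(1, nTrees);                             % equal tree weights (assumed)
oob = B.OOBIndices;                              % N-by-nTrees logical, true where out of bag

% Collect each tree's prediction for every training observation.
P = zeros(size(X, 1), nTrees);
for t = 1:nTrees
    P(:, t) = predict(B.Trees{t}, X);
end

% Keep a tree's weight only where the observation is out of bag for that tree.
Wt = bsxfun(@times, double(oob), w);

% Weighted mean over the out-of-bag trees of each observation.
den = sum(Wt, 2);
Ymanual = sum(P .* Wt, 2) ./ den;

% Fallback for observations that are in bag for every tree:
% the observation-weighted mean of all training responses.
neverOob = (den == 0);
Ymanual(neverOob) = sum(B.W(:) .* y) / sum(B.W);

% Compare the hand computation with oobPredict.
maxDiff = max(abs(Ymanual - oobPredict(B)));
```

With 50 trees, an observation that is in bag for every tree is very unlikely, so the fallback branch rarely runs; it is included to mirror the second bullet of the regression case.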