ClassificationBaggedEnsemble

Package: classreg.learning.classif
Superclasses: ClassificationEnsemble

Classification ensemble grown by resampling

Description

ClassificationBaggedEnsemble combines a set of trained weak learner models and data on which these learners were trained. It can predict ensemble response for new data by aggregating predictions from its weak learners.

Construction

Create a bagged classification ensemble object using fitcensemble. Set the 'Method' name-value pair argument of fitcensemble to 'Bag' to use bootstrap aggregation (bagging; for example, a random forest).
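As a minimal sketch, assuming a predictor matrix X and a vector of class labels Y already exist in the workspace:

```matlab
% Train a bagged classification ensemble (a random forest) on the
% predictors X and labels Y. By default, fitcensemble grows 100 trees
% when 'Method' is 'Bag'.
ens = fitcensemble(X,Y,'Method','Bag');
```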

Properties

BinEdges

Bin edges for numeric predictors, specified as a cell array of p numeric vectors, where p is the number of predictors. Each vector includes the bin edges for a numeric predictor. The element in the cell array for a categorical predictor is empty because the software does not bin categorical predictors.

The software bins numeric predictors only if you specify the 'NumBins' name-value pair argument as a positive integer scalar when training a model with tree learners. The BinEdges property is empty if the 'NumBins' value is empty (default).
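For instance, a model that populates BinEdges might be trained as follows; the 'NumBins' value of 50 is an arbitrary choice for illustration:

```matlab
% Bin each numeric predictor into at most 50 equiprobable bins before
% growing the trees. Binning can speed up training on large data sets.
mdl = fitcensemble(X,Y,'Method','Bag','NumBins',50);
edges = mdl.BinEdges;  % cell array with one vector of bin edges per numeric predictor
```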

You can reproduce the binned predictor data Xbinned by using the BinEdges property of the trained model mdl.

X = mdl.X; % Predictor data
Xbinned = zeros(size(X));
edges = mdl.BinEdges;
% Find indices of binned predictors.
idxNumeric = find(~cellfun(@isempty,edges));
if iscolumn(idxNumeric)
    idxNumeric = idxNumeric';
end
for j = idxNumeric 
    x = X(:,j);
    % Convert x to array if x is a table.
    if istable(x) 
        x = table2array(x);
    end
    % Group x into bins by using the discretize function.
    xbinned = discretize(x,[-inf; edges{j}; inf]); 
    Xbinned(:,j) = xbinned;
end
Xbinned contains the bin indices, ranging from 1 to the number of bins, for numeric predictors. Xbinned values are 0 for categorical predictors. If X contains NaNs, then the corresponding Xbinned values are NaNs.

CategoricalPredictors

Categorical predictor indices, specified as a vector of positive integers. CategoricalPredictors contains index values corresponding to the columns of the predictor data that contain categorical predictors. If none of the predictors are categorical, then this property is empty ([]).
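As a small sketch (assuming a trained model named Mdl), the names of the categorical predictors can be recovered by indexing PredictorNames with this property:

```matlab
% Names of the categorical predictors, if any.
catIdx = Mdl.CategoricalPredictors;        % e.g., [] if all predictors are numeric
catNames = Mdl.PredictorNames(catIdx);     % cell array of categorical predictor names
```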

ClassNames

List of the elements in Y with duplicates removed. ClassNames can be a numeric vector, categorical vector, logical vector, character array, or cell array of character vectors. ClassNames has the same data type as the data in the argument Y. (The software treats string arrays as cell arrays of character vectors.)

CombineWeights

Character vector describing how ens combines weak learner weights, either 'WeightedSum' or 'WeightedAverage'.

ExpandedPredictorNames

Expanded predictor names, stored as a cell array of character vectors.

If the model uses encoding for categorical variables, then ExpandedPredictorNames includes the names that describe the expanded variables. Otherwise, ExpandedPredictorNames is the same as PredictorNames.

FitInfo

Numeric array of fit information. The FitInfoDescription property describes the content of this array.

FitInfoDescription

Character vector describing the meaning of the FitInfo array.

FResample

Numeric scalar between 0 and 1. FResample is the fraction of training data fitcensemble resampled at random for every weak learner when constructing the ensemble.

HyperparameterOptimizationResults

Description of the cross-validation optimization of hyperparameters, stored as a BayesianOptimization object or a table of hyperparameters and associated values. Nonempty when the OptimizeHyperparameters name-value pair is nonempty at creation. Value depends on the setting of the HyperparameterOptimizationOptions name-value pair at creation:

  • 'bayesopt' (default) — Object of class BayesianOptimization

  • 'gridsearch' or 'randomsearch' — Table of hyperparameters used, observed objective function values (cross-validation loss), and rank of observations from lowest (best) to highest (worst)
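A sketch of creating an ensemble with a nonempty HyperparameterOptimizationResults property; the 'auto' setting shown here is one standard choice:

```matlab
% Optimize hyperparameters with Bayesian optimization (the default
% optimizer), which stores a BayesianOptimization object in the model.
rng('default')  % for reproducibility
Mdl = fitcensemble(X,Y,'OptimizeHyperparameters','auto');
results = Mdl.HyperparameterOptimizationResults;
```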

Method

Character vector describing the method that creates ens.

ModelParameters

Parameters used in training ens.

NumTrained

Number of trained weak learners in ens, a scalar.

PredictorNames

Cell array of names for the predictor variables, in the order in which they appear in X.

ReasonForTermination

Character vector describing the reason fitcensemble stopped adding weak learners to the ensemble.

Replace

Logical value indicating if the ensemble was trained with replacement (true) or without replacement (false).

ResponseName

Character vector with the name of the response variable Y.

ScoreTransform

Function handle for transforming scores, or character vector representing a built-in transformation function. 'none' means no transformation; equivalently, 'none' means @(x)x. For a list of built-in transformation functions and the syntax of custom transformation functions, see fitctree.

Add or change a ScoreTransform function using dot notation:

ens.ScoreTransform = 'function'

or

ens.ScoreTransform = @function
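For example, the built-in 'logit' transformation and an equivalent anonymous function handle (both are standard forms of this assignment):

```matlab
ens.ScoreTransform = 'logit';               % built-in transformation by name
ens.ScoreTransform = @(x) 1./(1 + exp(-x)); % equivalent custom function handle
```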

Trained

Trained learners, a cell array of compact classification models.

TrainedWeights

Numeric vector of trained weights for the weak learners in ens. TrainedWeights has T elements, where T is the number of weak learners in ens (NumTrained).

UseObsForLearner

Logical matrix of size N-by-NumTrained, where N is the number of observations in the training data and NumTrained is the number of trained weak learners. UseObsForLearner(I,J) is true if observation I was used for training learner J, and is false otherwise.
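As an illustration (assuming a trained bagged ensemble Mdl), the fraction of learners that saw each observation during training can be computed from this matrix:

```matlab
% Fraction of learners trained on each observation. With the default
% FResample of 1 and sampling with replacement, each observation is
% in-bag for roughly 63% of the learners on average.
inBagFrac = mean(Mdl.UseObsForLearner, 2);
```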

W

Scaled weights, a vector with length n, the number of rows in X. The sum of the elements of W is 1.

X

Matrix or table of predictor values that trained the ensemble. Each column of X represents one variable, and each row represents one observation.

Y

A categorical array, cell array of character vectors, character array, logical vector, or a numeric vector with the same number of rows as X. Each row of Y represents the classification of the corresponding row of X.

Methods

oobEdge - Out-of-bag classification edge
oobLoss - Out-of-bag classification error
oobMargin - Out-of-bag classification margins
oobPermutedPredictorImportance - Predictor importance estimates by permutation of out-of-bag predictor observations for random forest of classification trees
oobPredict - Predict out-of-bag response of ensemble

Inherited Methods

compact - Compact classification ensemble
crossval - Cross-validate ensemble
resubEdge - Classification edge by resubstitution
resubLoss - Classification error by resubstitution
resubMargin - Classification margins by resubstitution
resubPredict - Classify observations in ensemble of classification models
resume - Resume training ensemble
edge - Classification edge
loss - Classification error
margin - Classification margins
predict - Classify observations using ensemble of classification models
predictorImportance - Estimates of predictor importance
removeLearners - Remove members of compact classification ensemble

Copy Semantics

Value. To learn how value classes affect copy operations, see Copying Objects (MATLAB).

Examples


Load the ionosphere data set.

load ionosphere

You can train a bagged ensemble of 100 classification trees using all of the measurements.

Mdl = fitcensemble(X,Y,'Method','Bag')

fitcensemble uses a default template tree object templateTree() as a weak learner when 'Method' is 'Bag'. In this example, for reproducibility, specify 'Reproducible',true when you create a tree template object, and then use the object as a weak learner.

rng('default') % For reproducibility
t = templateTree('Reproducible',true); % For reproducibility of random predictor selections
Mdl = fitcensemble(X,Y,'Method','Bag','Learners',t)
Mdl = 
  classreg.learning.classif.ClassificationBaggedEnsemble
             ResponseName: 'Y'
    CategoricalPredictors: []
               ClassNames: {'b'  'g'}
           ScoreTransform: 'none'
          NumObservations: 351
               NumTrained: 100
                   Method: 'Bag'
             LearnerNames: {'Tree'}
     ReasonForTermination: 'Terminated normally after completing the requested number of training cycles.'
                  FitInfo: []
       FitInfoDescription: 'None'
                FResample: 1
                  Replace: 1
         UseObsForLearner: [351x100 logical]


  Properties, Methods

Mdl is a ClassificationBaggedEnsemble model object.

Mdl.Trained is the property that stores a 100-by-1 cell vector of the trained classification trees (CompactClassificationTree model objects) that compose the ensemble.

Plot a graph of the first trained classification tree.

view(Mdl.Trained{1},'Mode','graph')

By default, fitcensemble grows deep decision trees for bagged ensembles.

Estimate the in-sample misclassification rate.

L = resubLoss(Mdl)
L = 0

L is 0, which indicates that Mdl classifies the training data perfectly.
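A resubstitution loss of 0 is optimistic for bagged trees, because each deep tree can memorize its training sample. The out-of-bag error gives a less biased estimate of generalization error; a sketch using the ensemble trained above:

```matlab
% Out-of-bag misclassification rate: each tree is evaluated only on the
% observations it did not see during training.
oobL = oobLoss(Mdl)
```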

Tips

For a bagged ensemble of classification trees, the Trained property of ens stores a cell vector of ens.NumTrained CompactClassificationTree model objects. For a textual or graphical display of tree t in the cell vector, enter

view(ens.Trained{t})


Introduced in R2011a