regularize

Find optimal weights for learners in regression ensemble

Syntax

``ens1 = regularize(ens)``
``ens1 = regularize(ens,Name=Value)``

Description

`ens1 = regularize(ens)` finds optimal weights for learners in `ens` using lasso regularization. `regularize` returns a `RegressionEnsemble` model identical to `ens`, but with a populated `Regularization` property.


`ens1 = regularize(ens,Name=Value)` specifies additional options using one or more name-value arguments. For example, you can specify the regularization parameter values, relative tolerance on the regularization level, and maximum number of lasso optimization passes.

Examples


Regularize an ensemble of bagged trees.

Generate sample data.

```
rng(10,"twister") % For reproducibility
X = rand(2000,20);
Y = repmat(-1,2000,1);
Y(sum(X(:,1:5),2)>2.5) = 1;
```

You can create a bagged regression ensemble of 300 trees from the sample data.

```bag = fitrensemble(X,Y,Method="Bag",NumLearningCycles=300); ```

`fitrensemble` uses a default template tree object `templateTree()` as a weak learner when `Method` is `"Bag"`. In this example, for reproducibility, specify `Reproducible=true` when you create a tree template object, and then use the object as a weak learner.

```
t = templateTree(Reproducible=true); % For reproducibility of random predictor selections
bag = fitrensemble(X,Y,Method="Bag",NumLearningCycles=300,Learners=t);
```

Regularize the ensemble of bagged regression trees.

`bag = regularize(bag,Lambda=[0.001 0.1],Verbose=1);`
```
Starting lasso regularization for Lambda=0.001. Initial MSE=0.109923.
    Lasso regularization completed pass 1 for Lambda=0.001
        MSE = 0.086912
        Relative change in MSE = 0.264768
        Number of learners with nonzero weights = 15
    Lasso regularization completed pass 2 for Lambda=0.001
        MSE = 0.0670602
        Relative change in MSE = 0.296029
        Number of learners with nonzero weights = 34
    Lasso regularization completed pass 3 for Lambda=0.001
        MSE = 0.0623931
        Relative change in MSE = 0.0748019
        Number of learners with nonzero weights = 51
    Lasso regularization completed pass 4 for Lambda=0.001
        MSE = 0.0605444
        Relative change in MSE = 0.0305348
        Number of learners with nonzero weights = 70
    Lasso regularization completed pass 5 for Lambda=0.001
        MSE = 0.0599666
        Relative change in MSE = 0.00963517
        Number of learners with nonzero weights = 94
    Lasso regularization completed pass 6 for Lambda=0.001
        MSE = 0.0598835
        Relative change in MSE = 0.00138719
        Number of learners with nonzero weights = 105
    Lasso regularization completed pass 7 for Lambda=0.001
        MSE = 0.0598608
        Relative change in MSE = 0.000379227
        Number of learners with nonzero weights = 113
    Lasso regularization completed pass 8 for Lambda=0.001
        MSE = 0.0598586
        Relative change in MSE = 3.72856e-05
        Number of learners with nonzero weights = 115
    Lasso regularization completed pass 9 for Lambda=0.001
        MSE = 0.0598587
        Relative change in MSE = 6.42954e-07
        Number of learners with nonzero weights = 115
    Lasso regularization completed pass 10 for Lambda=0.001
        MSE = 0.0598587
        Relative change in MSE = 4.53658e-08
        Number of learners with nonzero weights = 115
    Completed lasso minimization for Lambda=0.001.
    Resubstitution MSE changed from 0.109923 to 0.0598587.
    Number of learners reduced from 300 to 115.
Starting lasso regularization for Lambda=0.1. Initial MSE=0.109923.
    Lasso regularization completed pass 1 for Lambda=0.1
        MSE = 0.104917
        Relative change in MSE = 0.0477191
        Number of learners with nonzero weights = 12
    Lasso regularization completed pass 2 for Lambda=0.1
        MSE = 0.0851031
        Relative change in MSE = 0.232821
        Number of learners with nonzero weights = 30
    Lasso regularization completed pass 3 for Lambda=0.1
        MSE = 0.081245
        Relative change in MSE = 0.0474877
        Number of learners with nonzero weights = 40
    Lasso regularization completed pass 4 for Lambda=0.1
        MSE = 0.0796749
        Relative change in MSE = 0.0197067
        Number of learners with nonzero weights = 53
    Lasso regularization completed pass 5 for Lambda=0.1
        MSE = 0.0788411
        Relative change in MSE = 0.0105746
        Number of learners with nonzero weights = 64
    Lasso regularization completed pass 6 for Lambda=0.1
        MSE = 0.0784959
        Relative change in MSE = 0.00439793
        Number of learners with nonzero weights = 81
    Lasso regularization completed pass 7 for Lambda=0.1
        MSE = 0.0784429
        Relative change in MSE = 0.000676468
        Number of learners with nonzero weights = 88
    Lasso regularization completed pass 8 for Lambda=0.1
        MSE = 0.078447
        Relative change in MSE = 5.24449e-05
        Number of learners with nonzero weights = 88
    Completed lasso minimization for Lambda=0.1.
    Resubstitution MSE changed from 0.109923 to 0.078447.
    Number of learners reduced from 300 to 88.
```

`regularize` reports on its progress.

Inspect the resulting regularization structure.

`bag.Regularization`
```
ans = struct with fields:
               Method: 'Lasso'
       TrainedWeights: [300x2 double]
               Lambda: [1.0000e-03 0.1000]
    ResubstitutionMSE: [0.0599 0.0784]
       CombineWeights: @classreg.learning.combiner.WeightedSum
```

Check how many learners in the regularized ensemble have positive weights. These are the learners included in a shrunken ensemble.

`sum(bag.Regularization.TrainedWeights > 0)`
```
ans = 1×2

   115    88
```

Shrink the ensemble using the weights from `Lambda = 0.1`.

`cmp = shrink(bag,weightcolumn=2)`
```
cmp = 
  CompactRegressionEnsemble
             ResponseName: 'Y'
    CategoricalPredictors: []
        ResponseTransform: 'none'
               NumTrained: 88
```

The compact ensemble contains `88` members, fewer than one-third of the original `300`.
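As a follow-up check, it can be useful to compare the resubstitution error of the full ensemble with that of the shrunken one. The sketch below is an illustration rather than part of the original example: it uses a smaller, hypothetical data set (500 observations, 50 trees) so that it runs quickly, and the variable names `mseFull` and `mseCompact` are assumptions introduced here. Note that `regularize` only populates the `Regularization` property, so the regularized `bag` still predicts with its original learner weights; the regularized weights take effect after `shrink`.

```matlab
% Smaller, hypothetical setup so the sketch runs quickly
rng(10,"twister")
X = rand(500,20);
Y = repmat(-1,500,1);
Y(sum(X(:,1:5),2)>2.5) = 1;

t = templateTree(Reproducible=true);
bag = fitrensemble(X,Y,Method="Bag",NumLearningCycles=50,Learners=t);

% Regularize with a single Lambda value, then shrink using that column
bag = regularize(bag,Lambda=0.1);
cmp = shrink(bag,weightcolumn=1);

% Resubstitution MSE of the full and the shrunken ensembles
mseFull = mean((predict(bag,X) - Y).^2);
mseCompact = mean((predict(cmp,X) - Y).^2);

fprintf("Learners: %d -> %d\n",bag.NumTrained,cmp.NumTrained)
fprintf("Resubstitution MSE: %.4f (full), %.4f (shrunken)\n",mseFull,mseCompact)
```

Resubstitution error is optimistic for both models, so a held-out set or cross-validation is a better basis for choosing which `Lambda` column to shrink with.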

Input Arguments


Regression ensemble model, specified as a `RegressionEnsemble` model object trained with `fitrensemble`.

Name-Value Arguments

Specify optional pairs of arguments as `Name1=Value1,...,NameN=ValueN`, where `Name` is the argument name and `Value` is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose `Name` in quotes.

Example: `regularize(ens,MaxIter=100,Npass=5)` allows at most 100 iterations to reach the convergence tolerance and at most 5 passes for lasso optimization.
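To make the two calling conventions concrete, the following sketch runs the same regularization with the `Name=Value` syntax and with the comma-separated, quoted syntax used before R2021a. The small ensemble and the variable names `ensNew` and `ensOld` are hypothetical and serve only to make the snippet self-contained.

```matlab
% Hypothetical small ensemble, only to illustrate the two call styles
rng(1)
X = rand(200,5);
Y = X(:,1) + 0.1*randn(200,1);
ens = fitrensemble(X,Y,'Method','Bag','NumLearningCycles',20);

% R2021a and later: Name=Value syntax
ensNew = regularize(ens,Lambda=[0.001 0.1],Npass=5);

% Before R2021a: comma-separated pairs with each name in quotes
ensOld = regularize(ens,'Lambda',[0.001 0.1],'Npass',5);
```

Both calls pass the same options; the quoted form also works in current releases.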

Regularization parameter values for lasso, specified as a vector of nonnegative scalar values. For the default setting of `Lambda`, `regularize` calculates the smallest value `Lambda_max` for which all optimal weights for learners are `0`. The default value of `Lambda` is a vector including `0` and nine exponentially spaced numbers from `Lambda_max/1000` to `Lambda_max`.

Example: `Lambda=[0 0.001 0.01 0.1]`

Data Types: `single` | `double`
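The shape of the default `Lambda` grid described above can be sketched as follows. The value of `lambdaMax` here is a hypothetical placeholder; `regularize` computes `Lambda_max` internally from the ensemble, and this snippet only illustrates how a vector with `0` followed by nine exponentially spaced values would be laid out.

```matlab
% Hypothetical value standing in for the internally computed Lambda_max
lambdaMax = 2.5;

% Zero, then nine exponentially (log-uniformly) spaced values
% from lambdaMax/1000 up to lambdaMax
lambda = [0 logspace(log10(lambdaMax/1000),log10(lambdaMax),9)];

disp(lambda)
```

A grid like this can also be passed explicitly through the `Lambda` argument when a different range or resolution is needed.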

Maximum number of iterations allowed, specified as a positive integer. If the algorithm executes `MaxIter` iterations before reaching the convergence tolerance, then the function stops iterating and returns a warning message. The function can return more than one warning when either `Npass` or the number of `Lambda` values is greater than 1.

Example: `MaxIter=100`

Data Types: `single` | `double`

Maximum number of passes for lasso optimization, specified as a positive integer.

Example: `Npass=5`

Data Types: `single` | `double`

Relative tolerance on the regularized loss for lasso, specified as a numeric positive scalar.

Example: `Reltol=1e-4`

Data Types: `single` | `double`

Verbosity level, specified as `0` or `1`. When this argument is set to `1`, `regularize` displays more information during the regularization process.

Example: `Verbose=1`

Data Types: `single` | `double`


Lasso

The lasso algorithm finds an optimal set of learner weights `$\alpha_t$` that minimize

`$\sum _{n=1}^{N}{w}_{n}g\left(\left(\sum _{t=1}^{T}{\alpha }_{t}{h}_{t}\left({x}_{n}\right)\right),{y}_{n}\right)+\lambda \sum _{t=1}^{T}|{\alpha }_{t}|.$`

Here

• λ ≥ 0 is a parameter you provide, called the lasso parameter.

• `$h_t$` is a weak learner in the ensemble trained on N observations with predictors `$x_n$`, responses `$y_n$`, and weights `$w_n$`.

• `$g(f,y) = (f - y)^2$` is the squared error.
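The objective above can be evaluated directly for a given set of weights. The following sketch uses hypothetical toy numbers: `H` holds the predictions of two learners on three observations, and all variable names are assumptions introduced for illustration. It shows how the weighted squared error and the lasso penalty combine; it is not the optimization routine that `regularize` uses.

```matlab
% Hypothetical toy values, for illustration only
H = [1.0 0.8; 2.0 2.4; 3.0 2.9];   % N-by-T matrix of learner predictions h_t(x_n)
y = [1; 2; 3];                     % Responses y_n
w = [1/3; 1/3; 1/3];               % Observation weights w_n
alpha = [0.5; 0.5];                % Learner weights alpha_t
lambda = 0.1;                      % Lasso parameter

f = H*alpha;                       % Weighted ensemble prediction for each observation
fitTerm = sum(w .* (f - y).^2);    % Weighted squared error
penalty = lambda*sum(abs(alpha));  % Lasso penalty on the learner weights
objective = fitTerm + penalty;

fprintf("Fit term = %.4f, penalty = %.4f, objective = %.4f\n",fitTerm,penalty,objective)
```

Increasing `lambda` raises the cost of every nonzero weight, which is why larger `Lambda` values tend to leave fewer learners with nonzero weights.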

Version History

Introduced in R2011a