
CompactRegressionSVM

Package: classreg.learning.regr

Compact support vector machine regression model

Description

CompactRegressionSVM is a compact support vector machine (SVM) regression model. It consumes less memory than a full, trained support vector machine model (RegressionSVM model) because it does not store the data used to train the model.

Because the compact model does not store the training data, you cannot use it to perform certain tasks, such as cross-validation. However, you can use a compact SVM regression model to predict responses for new input data.
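For example, the following sketch trains a full model, compacts it, and predicts a response for a new observation. The carsmall variables (Horsepower, Weight, MPG) and the query point [150 3000] are arbitrary choices used only for illustration.

load carsmall
X = [Horsepower Weight];                   % predictor matrix
Y = MPG;                                   % response

mdl = fitrsvm(X,Y,'Standardize',true);     % full model (stores the training data)
compactMdl = compact(mdl);                 % compact model (training data discarded)

yfit = predict(compactMdl,[150 3000])      % predicted response for a new observation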

Construction

compactMdl = compact(mdl) returns a compact SVM regression model compactMdl from a full, trained SVM regression model, mdl. For more information, see compact.

Input Arguments


Full, trained SVM regression model, specified as a RegressionSVM model returned by fitrsvm.

Properties


Dual problem coefficients, specified as a vector of numeric values. Alpha contains m elements, where m is the number of support vectors in the trained SVM regression model. The dual problem introduces two Lagrange multipliers for each support vector. The values of Alpha are the differences between the two estimated Lagrange multipliers for the support vectors. For more details, see Understanding Support Vector Machine Regression.

If you specified to remove duplicates using RemoveDuplicates, then, for a particular set of duplicate observations that are support vectors, Alpha contains one coefficient corresponding to the entire set. That is, MATLAB® attributes a nonzero coefficient to one observation from the set of duplicates and a coefficient of 0 to all other duplicate observations in the set.

Data Types: single | double

Primal linear problem coefficients, stored as a numeric vector of length p, where p is the number of predictors in the SVM regression model.

The values in Beta are the linear coefficients for the primal optimization problem.

If the model is obtained using a kernel function other than 'linear', this property is empty ([]).

The predict method computes predicted response values for the model as YFIT = (X/S)×Beta + Bias, where S is the kernel scale stored in the KernelParameters.Scale property.

Data Types: double
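The following sketch checks this formula against predict for a default (linear-kernel, unstandardized) model. The carsmall variables are arbitrary choices used only for illustration, and rows with missing predictor values are excluded from the comparison.

load carsmall
X = [Horsepower Weight];
Y = MPG;

mdl = fitrsvm(X,Y);                          % default linear kernel, no standardization
S = mdl.KernelParameters.Scale;

yfitManual = (X/S)*mdl.Beta + mdl.Bias;      % formula stated above
yfitPredict = predict(mdl,X);

ok = all(isfinite(X),2);                     % skip rows with missing predictors
max(abs(yfitManual(ok) - yfitPredict(ok)))   % should be near zero (floating-point error)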

Bias term in the SVM regression model, stored as a scalar value.

Data Types: double

Categorical predictor indices, specified as a vector of positive integers. CategoricalPredictors contains index values indicating that the corresponding predictors are categorical. The index values are between 1 and p, where p is the number of predictors used to train the model. If none of the predictors are categorical, then this property is empty ([]).

Data Types: single | double

Expanded predictor names, stored as a cell array of character vectors.

If the model uses encoding for categorical variables, then ExpandedPredictorNames includes the names that describe the expanded variables. Otherwise, ExpandedPredictorNames is the same as PredictorNames.

Data Types: cell
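The following sketch contrasts PredictorNames and ExpandedPredictorNames for a model with one categorical predictor. The table built from the carsmall variables is an arbitrary choice used only for illustration.

load carsmall
tbl = table(categorical(Cylinders),Weight,MPG, ...
    'VariableNames',{'Cylinders','Weight','MPG'});

mdl = fitrsvm(tbl,'MPG','Standardize',true);

mdl.PredictorNames            % {'Cylinders','Weight'}
mdl.ExpandedPredictorNames    % dummy-coded Cylinders levels followed by 'Weight'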

Kernel function parameters, stored as a structure with the following fields.

Field        Description
Function     Kernel function name (a character vector).
Scale        Numeric scale factor used to divide predictor values.

You can specify values for KernelParameters.Function and KernelParameters.Scale by using the KernelFunction and KernelScale name-value pair arguments in fitrsvm, respectively.

Data Types: struct
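You can confirm the stored kernel settings after training. The sketch below uses the carsmall variables, which are arbitrary choices for illustration.

load carsmall
mdl = fitrsvm([Horsepower Weight],MPG, ...
    'KernelFunction','gaussian','KernelScale','auto','Standardize',true);

mdl.KernelParameters.Function   % 'gaussian'
mdl.KernelParameters.Scale      % scale selected by the 'auto' heuristic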

Predictor means, stored as a vector of numeric values.

If the training data is standardized, then Mu is a numeric vector of length p, where p is the number of predictors used to train the model. In this case, the predict method centers predictor matrix X by subtracting the corresponding element of Mu from each column.

If the training data is not standardized, then Mu is empty ([]).

Data Types: single | double

Predictor names, stored as a cell array of character vectors containing the name of each predictor in the order in which the predictors appear in X. PredictorNames has a length equal to the number of columns in X.

Data Types: cell

Response variable name, stored as a character vector.

Data Types: char

Response transformation function, specified as 'none' or a function handle. ResponseTransform describes how the software transforms raw response values.

For a MATLAB function or a function that you define, enter its function handle. For example, you can enter Mdl.ResponseTransform = @function, where function accepts a numeric vector of the original responses and returns a numeric vector of the same size containing the transformed responses.

Data Types: char | function_handle
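The following sketch trains on a log-transformed response and then assigns a handle that maps predictions back to the original scale. The carsmall variables and the log transform are arbitrary choices used only for illustration.

load carsmall
mdl = fitrsvm([Horsepower Weight],log(MPG),'Standardize',true);
compactMdl = compact(mdl);

compactMdl.ResponseTransform = @exp;    % undo the log transform in predictions
yfit = predict(compactMdl,[150 3000])   % returned on the MPG scale, not the log scale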

Predictor standard deviations, stored as a vector of numeric values.

If the training data is standardized, then Sigma is a numeric vector of length p, where p is the number of predictors used to train the model. In this case, the predict method scales the predictor matrix X by dividing each column by the corresponding element of Sigma, after centering each element using Mu.

If the training data is not standardized, then Sigma is empty ([]).

Data Types: single | double
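The following sketch applies Mu and Sigma by hand and reproduces predict for a standardized, linear-kernel model, assuming that Beta and Bias are expressed in the standardized predictor space. The carsmall variables are arbitrary choices used only for illustration; rows with missing predictors are excluded from the comparison.

load carsmall
X = [Horsepower Weight];
Y = MPG;

mdl = fitrsvm(X,Y,'Standardize',true);       % linear kernel, standardized predictors
S = mdl.KernelParameters.Scale;

Xs = (X - mdl.Mu)./mdl.Sigma;                % center and scale, as predict does
yfitManual = (Xs/S)*mdl.Beta + mdl.Bias;
yfitPredict = predict(mdl,X);

ok = all(isfinite(X),2);                     % skip rows with missing predictors
max(abs(yfitManual(ok) - yfitPredict(ok)))   % should be near zero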

Support vectors, stored as an m-by-p matrix of numeric values. m is the number of support vectors (sum(Mdl.IsSupportVector)), and p is the number of predictors in X.

If you specified to remove duplicates using RemoveDuplicates, then for a given set of duplicate observations that are support vectors, SupportVectors contains one unique support vector.

Data Types: single | double
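The following sketch checks the relationship between Alpha, SupportVectors, and the full model's IsSupportVector indicator. The carsmall variables are arbitrary choices used only for illustration.

load carsmall
mdl = fitrsvm([Horsepower Weight],MPG, ...
    'KernelFunction','gaussian','KernelScale','auto','Standardize',true);
compactMdl = compact(mdl);

numel(compactMdl.Alpha)              % one coefficient per support vector
size(compactMdl.SupportVectors,1)    % same count, one row per support vector
sum(mdl.IsSupportVector)             % matches the full model's indicator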

Object Functions

discardSupportVectors    Discard support vectors for linear support vector machine (SVM) regression model
gather                   Gather properties of Statistics and Machine Learning Toolbox object from GPU
incrementalLearner       Convert support vector machine (SVM) regression model to incremental learner
lime                     Local interpretable model-agnostic explanations (LIME)
loss                     Regression error for support vector machine regression model
partialDependence        Compute partial dependence
plotPartialDependence    Create partial dependence plot (PDP) and individual conditional expectation (ICE) plots
predict                  Predict responses using support vector machine regression model
shapley                  Shapley values
update                   Update model parameters for code generation
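The following sketch exercises predict and loss on a compact model using held-out data. The carsmall variables, the removal of incomplete observations, and the 70/30 holdout split are arbitrary choices used only for illustration.

load carsmall
X = [Horsepower Weight];
Y = MPG;
ok = all(isfinite(X),2) & isfinite(Y);   % keep complete observations
X = X(ok,:);
Y = Y(ok);

rng default  % for reproducibility
cv = cvpartition(numel(Y),'HoldOut',0.3);

mdl = fitrsvm(X(training(cv),:),Y(training(cv)),'Standardize',true);
compactMdl = compact(mdl);

yfit = predict(compactMdl,X(test(cv),:));
L = loss(compactMdl,X(test(cv),:),Y(test(cv)))   % mean squared error by default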

Copy Semantics

Value. To learn how value classes affect copy operations, see Copying Objects.

Examples


This example shows how to reduce the size of a full, trained SVM regression model by discarding the training data and some information related to the training process.

This example uses the abalone data from the UCI Machine Learning Repository. Download the data and save it in your current directory with the name 'abalone.data'. Read the data into a table.

tbl = readtable('abalone.data','Filetype','text','ReadVariableNames',false);
rng default  % for reproducibility

The sample data contains 4177 observations. All of the predictor variables are continuous except for sex, which is a categorical variable with possible values 'M' (for males), 'F' (for females), and 'I' (for infants). The goal is to predict the number of rings on the abalone, and thereby determine its age, using physical measurements.

Train an SVM regression model using a Gaussian kernel function and an automatic kernel scale. Standardize the data.

mdl = fitrsvm(tbl,'Var9','KernelFunction','gaussian','KernelScale','auto','Standardize',true)
mdl = 

  RegressionSVM
           PredictorNames: {1x8 cell}
             ResponseName: 'Var9'
    CategoricalPredictors: 1
        ResponseTransform: 'none'
                    Alpha: [3635x1 double]
                     Bias: 10.8144
         KernelParameters: [1x1 struct]
                       Mu: [1x10 double]
                    Sigma: [1x10 double]
          NumObservations: 4177
           BoxConstraints: [4177x1 double]
          ConvergenceInfo: [1x1 struct]
          IsSupportVector: [4177x1 logical]
                   Solver: 'SMO'


  Properties, Methods

Compact the model.

compactMdl = compact(mdl)
compactMdl = 

  classreg.learning.regr.CompactRegressionSVM
           PredictorNames: {1x8 cell}
             ResponseName: 'Var9'
    CategoricalPredictors: 1
        ResponseTransform: 'none'
                    Alpha: [3635x1 double]
                     Bias: 10.8144
         KernelParameters: [1x1 struct]
                       Mu: [1x10 double]
                    Sigma: [1x10 double]
           SupportVectors: [3635x10 double]


  Properties, Methods

The compacted model discards the training data and some information related to the training process.

Compare the size of the full model mdl and the compact model compactMdl.

vars = whos('compactMdl','mdl');
[vars(1).bytes,vars(2).bytes]
ans =

      323793      775968

The compacted model consumes about half the memory of the full model.
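Although compactMdl does not store the training data, you can still use it to predict responses. For instance (using the first ten rows of tbl purely as an illustration of new data):

yfit = predict(compactMdl,tbl(1:10,:));
[tbl.Var9(1:10) yfit]   % observed ring counts next to predicted values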

References

[1] Nash, W.J., T. L. Sellers, S. R. Talbot, A. J. Cawthorn, and W. B. Ford. "The Population Biology of Abalone (Haliotis species) in Tasmania. I. Blacklip Abalone (H. rubra) from the North Coast and Islands of Bass Strait." Sea Fisheries Division, Technical Report No. 48, 1994.

[2] Waugh, S. "Extending and Benchmarking Cascade-Correlation: Extensions to the Cascade-Correlation Architecture and Benchmarking of Feed-forward Supervised Artificial Neural Networks." University of Tasmania Department of Computer Science thesis, 1995.

[3] Clark, D., Z. Schreter, and A. Adams. "A Quantitative Comparison of Dystal and Backpropagation." Submitted to the Australian Conference on Neural Networks, 1996.

[4] Lichman, M. UCI Machine Learning Repository, [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science.

Extended Capabilities

Version History

Introduced in R2015b
