crossval

Class: RegressionSVM

Cross-validated support vector machine regression model

Syntax

CVMdl = crossval(mdl)
CVMdl = crossval(mdl,Name,Value)

Description

CVMdl = crossval(mdl) returns a cross-validated (partitioned) support vector machine regression model, CVMdl, from a trained SVM regression model, mdl.

CVMdl = crossval(mdl,Name,Value) returns a cross-validated model with additional options specified by one or more Name,Value pair arguments.
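
For example, a minimal sketch, assuming a table tbl with response variable Var9 (as in the examples below):

mdl = fitrsvm(tbl,'Var9');  % train a full SVM regression model
CVMdl = crossval(mdl);      % cross-validate using the default 10-fold partition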

Input Arguments

mdl

Full, trained SVM regression model, specified as a RegressionSVM model returned by fitrsvm.

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

CVPartition

Cross-validation partition, specified as a cvpartition object that specifies the type of cross-validation and the indexing for the training and validation sets.

To create a cross-validated model, you can specify only one of these four name-value arguments: CVPartition, Holdout, KFold, or Leaveout.

Example: Suppose you create a random partition for 5-fold cross-validation on 500 observations by using cvp = cvpartition(500,KFold=5). Then, you can specify the cross-validation partition by setting CVPartition=cvp.
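
A minimal sketch, assuming mdl is a RegressionSVM model trained on those 500 observations:

cvp = cvpartition(500,KFold=5);         % 5-fold partition for 500 observations
CVMdl = crossval(mdl,CVPartition=cvp);  % cross-validate using the partition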

Holdout

Fraction of the data used for holdout validation, specified as a scalar value in the range (0,1). If you specify Holdout=p, then the software completes these steps:

  1. Randomly select and reserve p*100% of the data as validation data, and train the model using the rest of the data.

  2. Store the compact trained model in the Trained property of the cross-validated model.

To create a cross-validated model, you can specify only one of these four name-value arguments: CVPartition, Holdout, KFold, or Leaveout.

Example: Holdout=0.1

Data Types: double | single
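
For example, a sketch assuming mdl is a trained RegressionSVM model:

CVMdl = crossval(mdl,Holdout=0.1);  % hold out 10% of the data for validation
holdoutModel = CVMdl.Trained{1};    % compact model trained on the other 90%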

KFold

Number of folds to use in the cross-validated model, specified as a positive integer value greater than 1. If you specify KFold=k, then the software completes these steps:

  1. Randomly partition the data into k sets.

  2. For each set, reserve the set as validation data, and train the model using the other k – 1 sets.

  3. Store the k compact trained models in a k-by-1 cell vector in the Trained property of the cross-validated model.

To create a cross-validated model, you can specify only one of these four name-value arguments: CVPartition, Holdout, KFold, or Leaveout.

Example: KFold=5

Data Types: single | double
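
For example, a sketch assuming mdl is a trained RegressionSVM model:

CVMdl = crossval(mdl,KFold=5);  % 5-fold cross-validation
fold1Model = CVMdl.Trained{1};  % compact model trained with fold 1 held out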

Leaveout

Leave-one-out cross-validation flag, specified as "on" or "off". If you specify Leaveout="on", then for each of the n observations (where n is the number of observations, excluding missing observations, specified in the NumObservations property of the model), the software completes these steps:

  1. Reserve the one observation as validation data, and train the model using the other n – 1 observations.

  2. Store the n compact trained models in an n-by-1 cell vector in the Trained property of the cross-validated model.

To create a cross-validated model, you can specify only one of these four name-value arguments: CVPartition, Holdout, KFold, or Leaveout.

Example: Leaveout="on"

Data Types: char | string
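
For example, a sketch assuming mdl is a trained RegressionSVM model; the Trained property of the result contains one compact model per observation:

CVMdl = crossval(mdl,Leaveout="on");  % leave-one-out cross-validation
numModels = numel(CVMdl.Trained);     % equals CVMdl.NumObservations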

Output Arguments

CVMdl

Cross-validated SVM regression model, returned as a RegressionPartitionedSVM model.
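
Use the kfold methods of the returned object to assess the model. For example, a sketch assuming CVMdl is a cross-validated model created by crossval:

yHat = kfoldPredict(CVMdl);  % prediction for each observation from the model trained without it
mse = kfoldLoss(CVMdl);      % cross-validation loss (mean squared error by default)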

Examples

This example shows how to train a cross-validated SVM regression model using crossval.

This example uses the abalone data from the UCI Machine Learning Repository. Download the data and save it in your current folder with the name 'abalone.data'. Read the data into a table.

tbl = readtable('abalone.data','Filetype','text','ReadVariableNames',false);
rng default  % for reproducibility

The sample data contains 4177 observations. All the predictor variables are continuous except for sex, which is a categorical variable with possible values 'M' (for males), 'F' (for females), and 'I' (for infants). The goal is to predict the number of rings on the abalone shell, and thereby the abalone's age, using physical measurements.

Train an SVM regression model, using a Gaussian kernel function with a kernel scale equal to 2.2. Standardize the data.

mdl = fitrsvm(tbl,'Var9','KernelFunction','gaussian','KernelScale',2.2,'Standardize',true);

mdl is a trained RegressionSVM regression model.

Cross-validate the model using 10-fold cross-validation.

CVMdl = crossval(mdl)
CVMdl = 

  classreg.learning.partition.RegressionPartitionedSVM
      CrossValidatedModel: 'SVM'
           PredictorNames: {1x8 cell}
    CategoricalPredictors: 1
             ResponseName: 'Var9'
          NumObservations: 4177
                    KFold: 10
                Partition: [1x1 cvpartition]
        ResponseTransform: 'none'


  Properties, Methods

CVMdl is a RegressionPartitionedSVM cross-validated regression model. To create it, the software:

1. Randomly partitions the data into 10 equally sized sets.

2. Reserves one set as validation data and trains an SVM regression model on the other nine sets.

3. Repeats step 2 for each of the 10 sets, leaving out a different partition each time.

4. Combines generalization statistics from the 10 folds.

Calculate the cross-validation loss (by default, the mean squared error) for the model.

loss = kfoldLoss(CVMdl)
loss =

    4.5712

This example shows how to specify a holdout proportion for training a cross-validated SVM regression model.

This example uses the abalone data from the UCI Machine Learning Repository. Download the data and save it in your current folder with the name 'abalone.data'. Read the data into a table.

tbl = readtable('abalone.data','Filetype','text','ReadVariableNames',false);
rng default  % for reproducibility

The sample data contains 4177 observations. All the predictor variables are continuous except for sex, which is a categorical variable with possible values 'M' (for males), 'F' (for females), and 'I' (for infants). The goal is to predict the number of rings on the abalone shell, and thereby the abalone's age, using physical measurements.

Train an SVM regression model, using a Gaussian kernel function with an automatic kernel scale. Standardize the data.

mdl = fitrsvm(tbl,'Var9','KernelFunction','gaussian','KernelScale','auto','Standardize',true);

mdl is a trained RegressionSVM regression model.

Cross-validate the regression model by specifying a 10% holdout sample.

CVMdl = crossval(mdl,'Holdout',0.1)
CVMdl = 

  classreg.learning.partition.RegressionPartitionedSVM
      CrossValidatedModel: 'SVM'
           PredictorNames: {1x8 cell}
    CategoricalPredictors: 1
             ResponseName: 'Var9'
          NumObservations: 4177
                    KFold: 1
                Partition: [1x1 cvpartition]
        ResponseTransform: 'none'


  Properties, Methods

CVMdl is a RegressionPartitionedSVM model object.

Calculate the cross-validation loss for the model. Because the model uses a holdout partition, kfoldLoss returns the loss measured on the 10% validation sample.

loss = kfoldLoss(CVMdl)
loss =

    5.2499

Alternatives

Instead of training an SVM regression model and then cross-validating it, you can create a cross-validated model directly by using fitrsvm and specifying any of these name-value pair arguments: 'CrossVal', 'CVPartition', 'Holdout', 'Leaveout', or 'KFold'.
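
For example, a sketch assuming a table tbl with response variable Var9, as in the examples above:

CVMdl = fitrsvm(tbl,'Var9','KFold',5);  % train and cross-validate in one step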

References

[1] Nash, W.J., T. L. Sellers, S. R. Talbot, A. J. Cawthorn, and W. B. Ford. "The Population Biology of Abalone (Haliotis species) in Tasmania. I. Blacklip Abalone (H. rubra) from the North Coast and Islands of Bass Strait." Sea Fisheries Division, Technical Report No. 48, 1994.

[2] Waugh, S. "Extending and Benchmarking Cascade-Correlation: Extensions to the Cascade-Correlation Architecture and Benchmarking of Feed-forward Supervised Artificial Neural Networks." University of Tasmania Department of Computer Science thesis, 1995.

[3] Clark, D., Z. Schreter, and A. Adams. "A Quantitative Comparison of Dystal and Backpropagation." Submitted to the Australian Conference on Neural Networks, 1996.

[4] Lichman, M. UCI Machine Learning Repository, http://archive.ics.uci.edu/ml. Irvine, CA: University of California, School of Information and Computer Science.

Version History

Introduced in R2015b
