# fitPosterior

Fit posterior probabilities for compact support vector machine (SVM) classifier

## Description

ScoreSVMModel = fitPosterior(SVMModel,TBL,Y) returns a trained support vector machine (SVM) classifier ScoreSVMModel containing the optimal score-to-posterior-probability transformation function for two-class learning. For more details, see Algorithms. If you train SVMModel using a table, then you must use a table as input for fitPosterior.


ScoreSVMModel = fitPosterior(SVMModel,X,Y) returns a trained SVM classifier ScoreSVMModel containing the optimal score-to-posterior-probability transformation function for two-class learning. If you train SVMModel using a matrix, then you must use a matrix as input for fitPosterior.


[ScoreSVMModel,ScoreTransform] = fitPosterior(___) additionally returns the optimal score-to-posterior-probability transformation function parameters (ScoreTransform) for any of the input argument combinations in the previous syntaxes.

## Examples


Load the ionosphere data set. Reserve 20 random observations of the data, and consider this set new data.

load ionosphere
n = size(X,1);
rng(1);  % For reproducibility

indx = ~ismember(1:n,randsample(n,20)); % Indices for the training data

The classes of this data set are inseparable.

Train an SVM classifier using the training data. Standardize the data and specify that 'g' is the positive class.

SVMModel = fitcsvm(X(indx,:),Y(indx),'ClassNames',{'b','g'},...
'Standardize',true);

SVMModel is a ClassificationSVM classifier.

Use the new data set to estimate the optimal score-to-posterior-probability transformation function for mapping scores to the posterior probability of an observation being classified as g. For efficiency, make a compact version of SVMModel, and pass it and the new data to fitPosterior.

CompactSVMModel = compact(SVMModel);
[ScoreCSVMModel,ScoreParameters] = fitPosterior(CompactSVMModel,...
X(~indx,:),Y(~indx));

ScoreTransform = ScoreCSVMModel.ScoreTransform
ScoreTransform =
'@(S)sigmoid(S,-1.098982e+00,4.521089e-01)'
ScoreParameters
ScoreParameters = struct with fields:
Type: 'sigmoid'
Slope: -1.0990
Intercept: 0.4521

ScoreTransform is the optimal score transformation function. ScoreParameters is a structure array with three fields: the score transformation function name (Type), the sigmoid slope estimate (Slope), and the sigmoid intercept estimate (Intercept).

Alternatively, you can pass SVMModel and the new data to fitSVMPosterior, but this process is not as efficient.

Estimate the posterior probabilities that the observations in the new data are in class g.

[labels,postProbs] = predict(ScoreCSVMModel,X(~indx,:));
table(Y(~indx),labels,postProbs(:,2),...
'VariableNames',{'TrueLabel','PredictedLabel','PosteriorProbability'})
ans=20×3 table
TrueLabel    PredictedLabel    PosteriorProbability
_________    ______________    ____________________

{'g'}          {'g'}                  0.7844
{'b'}          {'b'}                0.024579
{'g'}          {'g'}                 0.82402
{'b'}          {'b'}               0.0061634
{'b'}          {'b'}              3.6099e-06
{'b'}          {'b'}                 0.15687
{'b'}          {'g'}                 0.96219
{'b'}          {'b'}              6.1317e-09
{'b'}          {'b'}               0.0019642
{'g'}          {'g'}                 0.72507
{'g'}          {'g'}                 0.70265
{'b'}          {'b'}                0.075291
{'g'}          {'g'}                 0.90691
{'g'}          {'g'}                 0.82844
{'b'}          {'b'}                0.051179
{'g'}          {'g'}                  0.9533
⋮

Load Fisher's iris data set. Use the petal lengths and widths as the predictor data, and remove the virginica species from the data. Reserve 10 random observations of the data, and consider this set new data.

load fisheriris
classKeep = ~strcmp(species,'virginica');
X = meas(classKeep,3:4);
Y = species(classKeep);

rng(1);  % For reproducibility
indx1 = 1:numel(species);
indx2 = indx1(classKeep);
indx = ~ismember(indx2,randsample(indx2,10)); % Indices for the training data

gscatter(X(indx,1),X(indx,2),Y(indx));
title('Scatter Diagram of Iris Measurements')
xlabel('Petal length')
ylabel('Petal width')
legend('Setosa','Versicolor')

The classes are perfectly separable. Therefore, the score-to-posterior-probability transformation function is a step function.

Train an SVM classifier. Standardize the data and specify that versicolor is the positive class.

SVMModel = fitcsvm(X(indx,:),Y(indx),...
'ClassNames',{'setosa','versicolor'},'Standardize',true);

SVMModel is a ClassificationSVM classifier.

Use the new data set to estimate the optimal score-to-posterior-probability transformation function for mapping scores to the posterior probability of an observation being classified as versicolor. For efficiency, make a compact version of SVMModel, and pass it and the new data to fitPosterior.

CompactSVMModel = compact(SVMModel);
[ScoreCSVMModel,ScoreParameters] = fitPosterior(CompactSVMModel,...
X(~indx,:),Y(~indx));
Warning: Classes are perfectly separated. The optimal score-to-posterior transformation is a step function.
ScoreTransform = ScoreCSVMModel.ScoreTransform
ScoreTransform =
'@(S)step(S,-1.338450e+00,2.012495e+00,5.333333e-01)'

fitPosterior displays a warning whenever the classes are separable, and stores the step function in ScoreSVMModel.ScoreTransform.

Display the score function type and its estimated values.

ScoreParameters
ScoreParameters = struct with fields:
Type: 'step'
LowerBound: -1.3385
UpperBound: 2.0125
PositiveClassProbability: 0.5333

ScoreParameters is a structure array with four fields:

• Score transformation function type (Type)

• Score corresponding to the negative class boundary (LowerBound)

• Score corresponding to the positive class boundary (UpperBound)

• Positive class probability (PositiveClassProbability)

Alternatively, you can pass SVMModel and the new data to fitSVMPosterior, but this process is not as efficient.

Estimate the posterior probabilities that the observations in the new data are versicolor irises.

[labels,postProbs] = predict(ScoreCSVMModel,X(~indx,:));
table(Y(~indx),labels,postProbs(:,2),...
'VariableNames',{'TrueLabel','PredictedLabel','PosteriorProbability'})
ans=10×3 table
TrueLabel       PredictedLabel    PosteriorProbability
______________    ______________    ____________________

{'setosa'    }    {'setosa'    }             0
{'setosa'    }    {'setosa'    }             0
{'setosa'    }    {'setosa'    }             0
{'setosa'    }    {'setosa'    }             0
{'setosa'    }    {'setosa'    }             0
{'setosa'    }    {'setosa'    }             0
{'setosa'    }    {'setosa'    }             0
{'setosa'    }    {'setosa'    }             0
{'versicolor'}    {'versicolor'}             1
{'versicolor'}    {'versicolor'}             1

Because the classes are separable, the step function transforms the positive-class score to:

• 0 if the score is less than ScoreParameters.LowerBound

• 1 if the score is greater than ScoreParameters.UpperBound

• ScoreParameters.PositiveClassProbability if the score is in the interval [ScoreParameters.LowerBound, ScoreParameters.UpperBound]

## Input Arguments


SVMModel: Trained, compact SVM classifier, specified as a CompactClassificationSVM model returned by compact.

TBL: Sample data, specified as a table. Each row of TBL corresponds to one observation, and each column corresponds to one predictor variable. TBL must contain all of the predictors used to train SVMModel. Optionally, TBL can contain an additional column for the response variable. Multicolumn variables and cell arrays other than cell arrays of character vectors are not allowed.

If TBL contains the response variable used to train SVMModel, then you do not need to specify Y. If TBL does not include the response variable, then the length of Y must be equal to the number of rows in TBL.

If the sample data used to train SVMModel is a table, then you must specify the input data for fitPosterior as a table.

If you set 'Standardize',true in fitcsvm when training SVMModel, then the software fits the transformation function parameter estimates using standardized data.

Data Types: table
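The table input described above can be illustrated with a minimal sketch. It assumes the ionosphere data set used in the examples, converts the predictor matrix to a table with array2table (so the variable names are generated automatically), and fits the transformation on the same data used for training purely for brevity. The variable names are illustrative, and the choice of data is an assumption rather than a recommendation.

```matlab
load ionosphere                      % provides X (predictor matrix) and Y (labels 'b' and 'g')

% Assumption for illustration: wrap the predictors in a table.
% The response is passed separately as Y.
TBL = array2table(X);

% Train on a table, so fitPosterior must also receive a table
SVMModel = fitcsvm(TBL,Y,'ClassNames',{'b','g'},'Standardize',true);
CompactSVMModel = compact(SVMModel);

% Fit the score-to-posterior-probability transformation
[ScoreSVMModel,ScoreTransform] = fitPosterior(CompactSVMModel,TBL,Y);

% Posterior probabilities for the first five rows; column 2 is the positive class 'g'
[~,postProbs] = predict(ScoreSVMModel,TBL(1:5,:));
```

Because the model is trained on a table, passing the raw matrix X to fitPosterior here would violate the requirement stated above.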

X: Predictor data used to estimate the score-to-posterior-probability transformation function, specified as a matrix.

Each row of X corresponds to one observation (also known as an instance or example), and each column corresponds to one variable (also known as a feature).

The length of Y and the number of rows in X must be equal.

If you set 'Standardize',true in fitcsvm when training SVMModel, then the software fits the transformation function parameter estimates using standardized data.

Data Types: double | single

Y: Class labels used to estimate the score-to-posterior-probability transformation function, specified as a categorical, character, or string array, a logical or numeric vector, or a cell array of character vectors.

If Y is a character array, then each element must correspond to one class label.

The length of Y and the number of rows in X must be equal.

Data Types: categorical | char | string | logical | single | double | cell

## Output Arguments


ScoreSVMModel: Trained, compact SVM classifier containing the estimated score-to-posterior-probability transformation function, returned as a CompactClassificationSVM classifier.

To estimate posterior probabilities for new observations, pass ScoreSVMModel and the new observations to predict.

ScoreTransform: Optimal score-to-posterior-probability transformation function parameters, returned as a structure array.

• If the value of the Type field of ScoreTransform is sigmoid, then ScoreTransform also has these fields:

• Slope: The value of A in the sigmoid function

• Intercept: The value of B in the sigmoid function

• If the value of the Type field of ScoreTransform is step, then ScoreTransform also has these fields:

• PositiveClassProbability: The value of π in the step function. This value represents the prior probability that an observation is in the positive class, or the posterior probability that an observation is in the positive class given that its score is in the interval [LowerBound,UpperBound].

• LowerBound: The value $\underset{{y}_{k}=-1}{\mathrm{max}}{s}_{k}$ in the step function. This value is the lower bound of the score interval in which observations are assigned the posterior probability PositiveClassProbability of being in the positive class. Any observation with a score less than LowerBound has a posterior probability of 0 of being in the positive class.

• UpperBound: The value $\underset{{y}_{k}=+1}{\mathrm{min}}{s}_{k}$ in the step function. This value is the upper bound of the score interval in which observations are assigned the posterior probability PositiveClassProbability of being in the positive class. Any observation with a score greater than UpperBound has a posterior probability of 1 of being in the positive class.

• If the value of the Type field of ScoreTransform is constant, then ScoreTransform.PredictedClass contains the name of the class prediction.

This result is the same as SVMModel.ClassNames. The posterior probability of an observation being in ScoreTransform.PredictedClass is always 1.


### Sigmoid Function

The sigmoid function that maps score ${s}_{j}$ corresponding to observation j to the positive class posterior probability is

$P\left({s}_{j}\right)=\frac{1}{1+\mathrm{exp}\left(A{s}_{j}+B\right)}.$

If the value of the Type field of ScoreTransform is sigmoid, then parameters A and B correspond to the fields Slope and Intercept of ScoreTransform, respectively.
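As a minimal sketch, the sigmoid mapping can be evaluated directly from the fitted parameters. The values of A and B below are copied from the ScoreParameters output in the ionosphere example; sigmoidPosterior is a hypothetical helper name introduced only for illustration, not a toolbox function.

```matlab
% Parameter values taken from the ionosphere example output
A = -1.098982;   % Slope
B = 0.4521089;   % Intercept

% Hypothetical helper implementing P(s) = 1/(1 + exp(A*s + B))
sigmoidPosterior = @(s) 1 ./ (1 + exp(A.*s + B));

% Evaluate the positive class posterior probability on a grid of scores
s = linspace(-5,5,11);
P = sigmoidPosterior(s);
```

Because the fitted slope A is negative in this example, larger positive-class scores map to larger posterior probabilities.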

### Step Function

The step function that maps score ${s}_{j}$ corresponding to observation j to the positive class posterior probability is

$P\left({s}_{j}\right)=\left\{\begin{array}{ll}0; & {s}_{j}<\underset{{y}_{k}=-1}{\mathrm{max}}{s}_{k}\\ \pi ; & \underset{{y}_{k}=-1}{\mathrm{max}}{s}_{k}\le {s}_{j}\le \underset{{y}_{k}=+1}{\mathrm{min}}{s}_{k}\\ 1; & {s}_{j}>\underset{{y}_{k}=+1}{\mathrm{min}}{s}_{k}\end{array}\right.,$

where:

• ${s}_{j}$ is the score of observation j.

• +1 and –1 denote the positive and negative classes, respectively.

• π is the prior probability that an observation is in the positive class.

If the value of the Type field of ScoreTransform is step, then the quantities $\underset{{y}_{k}=-1}{\mathrm{max}}{s}_{k}$ and $\underset{{y}_{k}=+1}{\mathrm{min}}{s}_{k}$ correspond to the fields LowerBound and UpperBound of ScoreTransform, respectively.
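The piecewise definition can be sketched in a few lines. The bounds and probability below are copied from the ScoreParameters output in the iris example; stepPosterior is a hypothetical helper name used only for illustration.

```matlab
% Parameter values taken from the iris example output
lowerBound = -1.3385;                 % maximum score among negative class observations
upperBound = 2.0125;                  % minimum score among positive class observations
positiveClassProbability = 0.5333;    % value of pi in the step function

% Hypothetical helper: 0 below lowerBound, pi inside the closed interval, 1 above upperBound
stepPosterior = @(s) double(s > upperBound) + ...
    positiveClassProbability .* (s >= lowerBound & s <= upperBound);

% One score in each region
P = stepPosterior([-2 0 3]);          % returns [0 0.5333 1]
```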

### Constant Function

The constant function maps all scores in a sample to posterior probabilities 1 or 0.

If all observations have posterior probability 1, then they are expected to come from the positive class.

If all observations have posterior probability 0, then they are not expected to come from the positive class.

## Tips

• This process describes one way to predict positive class posterior probabilities.

1. Train an SVM classifier by passing the data to fitcsvm. The result is a trained SVM classifier, such as SVMModel, that stores the data. The software sets the score transformation function property (SVMModel.ScoreTransform) to none.

2. Pass the trained SVM classifier SVMModel to fitSVMPosterior or fitPosterior. The result, such as ScoreSVMModel, is the same trained SVM classifier as SVMModel, except that the software sets ScoreSVMModel.ScoreTransform to the optimal score transformation function.

3. Pass the predictor data matrix and the trained SVM classifier containing the optimal score transformation function (ScoreSVMModel) to predict. The second column in the second output argument of predict stores the positive class posterior probabilities corresponding to each row of the predictor data matrix.

If you skip step 2, then predict returns the positive class score rather than the positive class posterior probability.

• After fitting posterior probabilities, you can generate C/C++ code that predicts labels for new data. Generating C/C++ code requires MATLAB® Coder™. For details, see Introduction to Code Generation.
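The three-step process in the first tip can be sketched as follows. This is a minimal illustration that assumes the ionosphere data set from the examples; the hold-out split and the variable names isNew, rawScores, and positivePosterior are choices made here for illustration only.

```matlab
load ionosphere                              % provides X (predictors) and Y (labels 'b' and 'g')
rng(1);                                      % for reproducibility
n = size(X,1);
isNew = ismember(1:n,randsample(n,20));      % hold out 20 observations as new data

% Step 1: train the classifier; its ScoreTransform property is 'none'
SVMModel = fitcsvm(X(~isNew,:),Y(~isNew),'ClassNames',{'b','g'},...
    'Standardize',true);

% Skipping step 2: the second output of predict contains scores, not probabilities
[~,rawScores] = predict(SVMModel,X(isNew,:));

% Step 2: fit the score-to-posterior-probability transformation
CompactSVMModel = compact(SVMModel);
ScoreSVMModel = fitPosterior(CompactSVMModel,X(~isNew,:),Y(~isNew));

% Step 3: the second column of the second output holds the
% positive class ('g') posterior probabilities
[labels,postProbs] = predict(ScoreSVMModel,X(isNew,:));
positivePosterior = postProbs(:,2);
```

Comparing rawScores(:,2) with positivePosterior shows the effect of step 2: the former are unbounded scores, and the latter lie between 0 and 1.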

## Algorithms

The software fits the appropriate score-to-posterior-probability transformation function by using the SVM classifier SVMModel and by conducting 10-fold cross-validation using the stored predictor data (SVMModel.X) and the class labels (SVMModel.Y), as outlined in [1]. The transformation function computes the posterior probability that an observation is classified into the positive class (SVMModel.ClassNames(2)).

• If the classes are inseparable, then the transformation function is the sigmoid function.

• If the classes are perfectly separable, then the transformation function is the step function.

• In two-class learning, if one of the two classes has a relative frequency of 0, then the transformation function is the constant function. The fitPosterior function is not appropriate for one-class learning.

• The software stores the optimal score-to-posterior-probability transformation function in ScoreSVMModel.ScoreTransform.

If you re-estimate the score-to-posterior-probability transformation function, that is, if you pass an SVM classifier to fitPosterior or fitSVMPosterior and its ScoreTransform property is not none, then the software:

• Displays a warning

• Resets the original transformation function to 'none' before estimating the new one

## Alternative Functionality

You can also fit the optimal score-to-posterior-probability function by using fitSVMPosterior. This function is similar to fitPosterior, but it is more general because it accepts a wider range of SVM classifier types.

## References

[1] Platt, J. “Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods.” Advances in Large Margin Classifiers. Cambridge, MA: The MIT Press, 2000, pp. 61–74.