Create Gaussian mixture model

A `gmdistribution` object stores a Gaussian mixture distribution, also called a Gaussian mixture model (GMM), which is a multivariate distribution that consists of multivariate Gaussian distribution components. Each component is defined by its mean and covariance, and the mixture is defined by a vector of mixing proportions.

You can create a `gmdistribution` model object in two ways.

- Use the `gmdistribution` function (described here) to create a `gmdistribution` model object by specifying the distribution parameters.
- Use the `fitgmdist` function to fit a `gmdistribution` model object to data given a fixed number of components.

`gm = gmdistribution(mu,sigma)`

`gm = gmdistribution(mu,sigma,p)`

`mu` — Means

Means of multivariate Gaussian distribution components, specified as a *k*-by-*m* numeric matrix, where *k* is the number of components and *m* is the number of variables in each component. `mu(i,:)` is the mean of component `i`.

**Data Types:** `single` | `double`

`sigma` — Covariances (numeric vector | numeric matrix | numeric array)

Covariances of multivariate Gaussian distribution components, specified as a numeric vector, matrix, or array.

Given that *k* is the number of components and *m* is the number of variables in each component, `sigma` is one of the values in this table.

Value | Description |
---|---|
*m*-by-*m*-by-*k* array | `sigma(:,:,i)` is the covariance matrix of component `i`. |
1-by-*m*-by-*k* array | Covariance matrices are diagonal. `sigma(1,:,i)` contains the diagonal elements of the covariance matrix of component `i`. |
*m*-by-*m* matrix | Covariance matrices are the same across components. |
1-by-*m* vector | Covariance matrices are diagonal and the same across components. |

**Data Types:** `single` | `double`
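As a rough illustration of the four `sigma` layouts in the table, the following sketch builds one mixture per layout for *k* = 2 components and *m* = 3 variables. The variable names and numeric values are made up for illustration, and the code assumes that Statistics and Machine Learning Toolbox is installed.

```matlab
mu = [0 0 0; 4 4 4];                 % k-by-m matrix of means (k = 2, m = 3)
m = size(mu, 2);

% m-by-m-by-k array: one full covariance matrix per component
sigmaFull = cat(3, eye(m), 2*eye(m));
gmFull = gmdistribution(mu, sigmaFull);

% 1-by-m-by-k array: one diagonal covariance per component
sigmaDiag = cat(3, [1 2 3], [0.5 0.5 0.5]);
gmDiag = gmdistribution(mu, sigmaDiag);

% m-by-m matrix: a single full covariance shared by all components
gmSharedFull = gmdistribution(mu, eye(m));

% 1-by-m vector: a single diagonal covariance shared by all components
gmSharedDiag = gmdistribution(mu, [1 2 3]);

% Each object reports the same number of components and variables
disp([gmFull.NumComponents, gmFull.NumVariables])
disp([gmSharedDiag.NumComponents, gmSharedDiag.NumVariables])
```

Using *m* = 3 rather than *m* = 2 keeps the layouts easy to tell apart by size alone.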

`p` — Mixing proportions of mixture components (numeric vector of length *k*)

Mixing proportions of mixture components, specified as a numeric vector of length *k*, where *k* is the number of components. The default is a row vector of (1/*k*)s, which sets equal proportions. If `p` does not sum to `1`, `gmdistribution` normalizes it.

**Data Types:** `single` | `double`
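The normalization of `p` can be checked with a small sketch. The means, covariance, and weights below are arbitrary illustrative values, not taken from this page.

```matlab
mu = [0 0; 5 5; -5 5];               % three bivariate components (illustrative)
sigma = eye(2);                      % one 2-by-2 covariance shared by all components
p = [2 1 1];                         % weights that sum to 4, not 1

gm = gmdistribution(mu, sigma, p);

% The stored proportions should be p rescaled to sum to 1, that is, p/sum(p)
disp(gm.ComponentProportion)
disp(sum(gm.ComponentProportion))
```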

`mu` — Means

This property is read-only.

Means of multivariate Gaussian distribution components, specified as a *k*-by-*m* numeric matrix, where *k* is the number of components and *m* is the number of variables in each component. `mu(i,:)` is the mean of component `i`.

**Data Types:** `single` | `double`

`Sigma` — Covariances (numeric vector | numeric matrix | numeric array)

This property is read-only.

Covariances of multivariate Gaussian distribution components, specified as a numeric vector, matrix, or array.

Given that *k* is the number of components and *m* is the number of variables in each component, `Sigma` is one of the values in this table.

Value | Description |
---|---|
*m*-by-*m*-by-*k* array | `Sigma(:,:,i)` is the covariance matrix of component `i`. |
1-by-*m*-by-*k* array | Covariance matrices are diagonal. `Sigma(1,:,i)` contains the diagonal elements of the covariance matrix of component `i`. |
*m*-by-*m* matrix | Covariance matrices are the same across components. |
1-by-*m* vector | Covariance matrices are diagonal and the same across components. |

**Data Types:** `single` | `double`

`ComponentProportion` — Mixing proportions of mixture components (1-by-*k* numeric vector)

This property is read-only.

Mixing proportions of mixture components, specified as a 1-by-*k* numeric vector.

**Data Types:** `single` | `double`

`CovarianceType` — Type of covariance matrices (`'diagonal'` | `'full'`)

This property is read-only.

Type of covariance matrices, specified as either `'diagonal'` or `'full'`.

- If you create a `gmdistribution` object by using the `gmdistribution` function, then the type of covariance matrices in the `sigma` input argument of `gmdistribution` sets this property.
- If you fit a `gmdistribution` object to data by using the `fitgmdist` function, then the `'CovarianceType'` name-value pair argument of `fitgmdist` sets this property.

`DistributionName` — Distribution name (default: `'gaussian mixture distribution'`)

This property is read-only.

Distribution name, specified as `'gaussian mixture distribution'`.

`NumComponents` — Number of mixture components (positive integer)

This property is read-only.

Number of mixture components, *k*, specified as a positive integer.

**Data Types:** `single` | `double`

`NumVariables` — Number of variables (positive integer)

This property is read-only.

Number of variables in the multivariate Gaussian distribution components, *m*, specified as a positive integer.

**Data Types:** `double`

`SharedCovariance` — Flag indicating shared covariance (`true` | `false`)

This property is read-only.

Flag indicating whether a covariance matrix is shared across mixture components, specified as `true` or `false`.

- If you create a `gmdistribution` object by using the `gmdistribution` function, then the type of covariance matrices in the `sigma` input argument of `gmdistribution` sets this property.
- If you fit a `gmdistribution` object to data by using the `fitgmdist` function, then the `'SharedCovariance'` name-value pair argument of `fitgmdist` sets this property.

**Data Types:** `logical`
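To see how the shape of `sigma` drives `CovarianceType` and `SharedCovariance`, the sketch below creates two objects and prints both properties. The parameter values are illustrative. Going by the `Sigma` table, the first object should report diagonal, unshared covariances and the second a full, shared covariance.

```matlab
mu = [1 2 3; -3 -5 0];                                % k = 2 components, m = 3 variables

gmA = gmdistribution(mu, cat(3, [2 .5 1], [1 1 1]));  % 1-by-m-by-k array
gmB = gmdistribution(mu, eye(3));                     % m-by-m matrix

% Print the covariance type and the shared-covariance flag of each object
fprintf('gmA: %s, shared = %d\n', gmA.CovarianceType, gmA.SharedCovariance);
fprintf('gmB: %s, shared = %d\n', gmB.CovarianceType, gmB.SharedCovariance);
```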

The following properties apply only to a fitted object you create by using `fitgmdist`. The values of these properties are empty if you create a `gmdistribution` object by using the `gmdistribution` function.

`AIC` — Akaike Information Criterion (scalar)

This property is read-only.

Akaike information criterion (AIC), specified as a scalar. `AIC = 2*NlogL + 2*p`, where `NlogL` is the negative loglikelihood (the `NegativeLogLikelihood` property) and `p` is the number of estimated parameters.

AIC is a model selection tool you can use to compare multiple models fit to the same data. AIC is a likelihood-based measure of model fit that includes a penalty for complexity, specifically, the number of parameters. When you compare multiple models, a model with a smaller value of AIC is better.

This property is empty if you create a `gmdistribution` object by using the `gmdistribution` function.

**Data Types:** `single` | `double`
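A common way to use AIC for model selection is to fit several candidate models to the same data and keep the one with the smallest value. The sketch below does this on synthetic data. The data, the candidate range `maxK`, and the choice of the `'Replicates'` and `'RegularizationValue'` settings are assumptions made for illustration.

```matlab
rng('default')                                   % for reproducibility
X1 = mvnrnd([1 2], [2 0; 0 .5], 500);            % synthetic data, component 1
X2 = mvnrnd([-3 -5], eye(2), 500);               % synthetic data, component 2
X = [X1; X2];

maxK = 4;                                        % largest number of components to try
aic = zeros(1, maxK);
for k = 1:maxK
    % Several replicates and a small regularization make the fit less fragile
    gm = fitgmdist(X, k, 'Replicates', 5, 'RegularizationValue', 1e-6);
    aic(k) = gm.AIC;
end

[~, bestK] = min(aic);                           % smaller AIC is better
fprintf('AIC values: %s\n', mat2str(aic, 6));
fprintf('Number of components with the smallest AIC: %d\n', bestK);
```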

`BIC` — Bayes Information Criterion (scalar)

This property is read-only.

Bayes information criterion (BIC), specified as a scalar. `BIC = 2*NlogL + p*log(n)`, where `NlogL` is the negative loglikelihood (the `NegativeLogLikelihood` property), `n` is the number of observations, and `p` is the number of estimated parameters.

BIC is a model selection tool you can use to compare multiple models fit to the same data. BIC is a likelihood-based measure of model fit that includes a penalty for complexity, specifically, the number of parameters. When you compare multiple models, a model with the lowest BIC value is the best fitting model.

This property is empty if you create a `gmdistribution` object by using the `gmdistribution` function.

**Data Types:** `single` | `double`
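Because AIC and BIC share the term `2*NlogL`, subtracting the two formulas gives `BIC - AIC = p*(log(n) - 2)`. The sketch below uses that relation to recover the parameter count `p` from a fitted model and then rebuilds AIC from `NegativeLogLikelihood`. The synthetic data and the variable names `pImplied` and `aicRebuilt` are illustrative assumptions.

```matlab
rng('default')                                   % for reproducibility
X = [mvnrnd([1 2], [2 0; 0 .5], 500); mvnrnd([-3 -5], eye(2), 500)];
n = size(X, 1);                                  % number of observations

gm = fitgmdist(X, 2);

% From AIC = 2*NlogL + 2*p and BIC = 2*NlogL + p*log(n):
%   BIC - AIC = p*(log(n) - 2)
pImplied = (gm.BIC - gm.AIC) / (log(n) - 2);

% Rebuild AIC from the negative loglikelihood and the implied p
aicRebuilt = 2*gm.NegativeLogLikelihood + 2*pImplied;

fprintf('Implied number of parameters: %g\n', pImplied);
fprintf('Stored AIC: %.4f, rebuilt AIC: %.4f\n', gm.AIC, aicRebuilt);
```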

`Converged` — Flag indicating convergence (`true` | `false`)

This property is read-only.

Flag indicating whether the Expectation-Maximization (EM) algorithm converged when fitting a Gaussian mixture model, specified as `true` or `false`.

You can change the optimization options by using the `'Options'` name-value pair argument of `fitgmdist`.

This property is empty if you create a `gmdistribution` object by using the `gmdistribution` function.

**Data Types:** `logical`

`NegativeLogLikelihood` — Negative loglikelihood (scalar)

This property is read-only.

Negative loglikelihood of the fitted Gaussian mixture model given the input data `X` of `fitgmdist`, specified as a scalar.

This property is empty if you create a `gmdistribution` object by using the `gmdistribution` function.

**Data Types:** `single` | `double`

`NumIterations` — Number of iterations (positive integer)

This property is read-only.

Number of iterations in the Expectation-Maximization (EM) algorithm, specified as a positive integer.

You can change the optimization options, including the maximum number of iterations allowed, by using the `'Options'` name-value pair argument of `fitgmdist`.

This property is empty if you create a `gmdistribution` object by using the `gmdistribution` function.

**Data Types:** `double`
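The `'Options'` argument takes a structure created with `statset`. The following sketch raises the iteration limit, fits a model, and then reads back `Converged` and `NumIterations`. The data and the specific option values are illustrative assumptions.

```matlab
rng('default')                                   % for reproducibility
X = [mvnrnd([1 2], [2 0; 0 .5], 500); mvnrnd([-3 -5], eye(2), 500)];

% EM iteration limit and termination tolerance (illustrative values)
opts = statset('MaxIter', 500, 'TolFun', 1e-8);
gm = fitgmdist(X, 2, 'Options', opts);

if gm.Converged
    fprintf('EM converged after %d iterations.\n', gm.NumIterations);
else
    fprintf('EM stopped at %d iterations without converging.\n', gm.NumIterations);
end
```

Checking `Converged` before using a fitted model is a cheap safeguard, because a fit that hits the iteration limit still returns an object.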

`ProbabilityTolerance` — Tolerance for posterior probabilities (nonnegative scalar value in range `[0,1e-6]`)

This property is read-only.

Tolerance for posterior probabilities, specified as a nonnegative scalar value in the range `[0,1e-6]`.

The `'ProbabilityTolerance'` name-value pair argument of `fitgmdist` sets this property.

This property is empty if you create a `gmdistribution` object by using the `gmdistribution` function.

**Data Types:** `single` | `double`

`RegularizationValue` — Regularization parameter value (nonnegative scalar)

This property is read-only.

Regularization parameter value, specified as a nonnegative scalar.

The `'RegularizationValue'` name-value pair argument of `fitgmdist` sets this property.

This property is empty if you create a `gmdistribution` object by using the `gmdistribution` function.

**Data Types:** `single` | `double`
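As a quick check that the `fitgmdist` name-value arguments end up in the corresponding properties, the sketch below passes both settings and reads them back. The data and the two values are arbitrary illustrative choices.

```matlab
rng('default')                                   % for reproducibility
X = [mvnrnd([1 2], [2 0; 0 .5], 500); mvnrnd([-3 -5], eye(2), 500)];

gm = fitgmdist(X, 2, ...
    'RegularizationValue', 0.01, ...             % added to the covariance diagonals
    'ProbabilityTolerance', 1e-7);               % must lie in the range [0,1e-6]

% The fitted object stores the values that were passed in
disp(gm.RegularizationValue)
disp(gm.ProbabilityTolerance)
```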

Function | Description |
---|---|
`cdf` | Cumulative distribution function for Gaussian mixture distribution |
`cluster` | Construct clusters from Gaussian mixture distribution |
`mahal` | Mahalanobis distance to Gaussian mixture component |
`pdf` | Probability density function for Gaussian mixture distribution |
`posterior` | Posterior probability of Gaussian mixture component |
`random` | Random variate from Gaussian mixture distribution |
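The sketch below calls each object function once on a small sample, mainly to show the call pattern and the size of each result. The mixture parameters and the sample size of 5 are illustrative assumptions.

```matlab
mu = [1 2; -3 -5];
sigma = cat(3, [2 .5], [1 1]);       % diagonal covariances, one per component
gm = gmdistribution(mu, sigma);

rng('default')                       % for reproducibility
Y = random(gm, 5);                   % 5-by-2 matrix of random variates
dens = pdf(gm, Y);                   % 5-by-1 vector of density values
cum = cdf(gm, Y);                    % 5-by-1 vector of cdf values
idx = cluster(gm, Y);                % 5-by-1 vector of component indices
post = posterior(gm, Y);             % 5-by-2 matrix, one column per component
d = mahal(gm, Y);                    % 5-by-2 matrix, one column per component

disp(size(post))
disp(sum(post, 2)')                  % each row of posterior probabilities sums to 1
```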

`gmdistribution`

Create a two-component bivariate Gaussian mixture distribution by using the `gmdistribution` function.

Define the distribution parameters (means and covariances) of two bivariate Gaussian mixture components.

```
mu = [1 2;-3 -5];
sigma = cat(3,[2 .5],[1 1]) % 1-by-2-by-2 array
```

    sigma = 
    sigma(:,:,1) =

        2.0000    0.5000

    sigma(:,:,2) =

         1     1

The `cat` function concatenates the covariances along the third array dimension. The defined covariance matrices are diagonal matrices. `sigma(1,:,i)` contains the diagonal elements of the covariance matrix of component `i`.

Create a `gmdistribution` object. By default, the `gmdistribution` function creates an equal proportion mixture.

    gm = gmdistribution(mu,sigma)

    gm = 

    Gaussian mixture distribution with 2 components in 2 dimensions
    Component 1:
    Mixing proportion: 0.500000
    Mean:     1     2

    Component 2:
    Mixing proportion: 0.500000
    Mean:    -3    -5

List the properties of the `gm` object.

    properties(gm)

    Properties for class gmdistribution:

        NumVariables
        DistributionName
        NumComponents
        ComponentProportion
        SharedCovariance
        NumIterations
        RegularizationValue
        NegativeLogLikelihood
        CovarianceType
        mu
        Sigma
        AIC
        BIC
        Converged
        ProbabilityTolerance

You can access these properties by using dot notation. For example, access the `ComponentProportion` property, which represents the mixing proportions of mixture components.

    gm.ComponentProportion

    ans = 1×2

        0.5000    0.5000

A `gmdistribution` object has properties that apply only to a fitted object. The fitted object properties are `AIC`, `BIC`, `Converged`, `NegativeLogLikelihood`, `NumIterations`, `ProbabilityTolerance`, and `RegularizationValue`. The values of the fitted object properties are empty if you create an object by using the `gmdistribution` function and specifying distribution parameters. For example, access the `NegativeLogLikelihood` property by using dot notation.

    gm.NegativeLogLikelihood

    ans = []

After you create a `gmdistribution` object, you can use the object functions. Use `cdf` and `pdf` to compute the values of the cumulative distribution function (cdf) and the probability density function (pdf). Use `random` to generate random vectors. Use `cluster`, `mahal`, and `posterior` for cluster analysis.

Visualize the object by using `pdf` and `ezsurf`.

ezsurf(@(x,y)pdf(gm,[x y]),[-10 10],[-10 10])
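As an alternative to the `ezsurf` call above, the density can be evaluated on an explicit grid and passed to `surf`, which gives direct control over the resolution. This is only a sketch. The grid size of 101 points per axis and the variable names `xg`, `yg`, and `Z` are illustrative choices.

```matlab
mu = [1 2; -3 -5];
sigma = cat(3, [2 .5], [1 1]);
gm = gmdistribution(mu, sigma);

[xg, yg] = meshgrid(linspace(-10, 10, 101));     % 101-by-101 evaluation grid
Z = reshape(pdf(gm, [xg(:) yg(:)]), size(xg));   % density at every grid point

surf(xg, yg, Z, 'EdgeColor', 'none')             % surface plot without mesh lines
xlabel('x'), ylabel('y'), zlabel('pdf')
```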

`fitgmdist`

Generate random variates that follow a mixture of two bivariate Gaussian distributions by using the `mvnrnd` function. Fit a Gaussian mixture model (GMM) to the generated data by using the `fitgmdist` function.

Define the distribution parameters (means and covariances) of two bivariate Gaussian mixture components.

    mu1 = [1 2];          % Mean of the 1st component
    sigma1 = [2 0; 0 .5]; % Covariance of the 1st component
    mu2 = [-3 -5];        % Mean of the 2nd component
    sigma2 = [1 0; 0 1];  % Covariance of the 2nd component

Generate an equal number of random variates from each component, and combine the two sets of random variates.

    rng('default') % For reproducibility
    r1 = mvnrnd(mu1,sigma1,1000);
    r2 = mvnrnd(mu2,sigma2,1000);
    X = [r1; r2];

The combined data set `X` contains random variates following a mixture of two bivariate Gaussian distributions.

Fit a two-component GMM to `X`.

    gm = fitgmdist(X,2)

    gm = 

    Gaussian mixture distribution with 2 components in 2 dimensions
    Component 1:
    Mixing proportion: 0.500000
    Mean:   -2.9617   -4.9727

    Component 2:
    Mixing proportion: 0.500000
    Mean:    0.9539    2.0261

List the properties of the `gm` object.

    properties(gm)

    Properties for class gmdistribution:

        NumVariables
        DistributionName
        NumComponents
        ComponentProportion
        SharedCovariance
        NumIterations
        RegularizationValue
        NegativeLogLikelihood
        CovarianceType
        mu
        Sigma
        AIC
        BIC
        Converged
        ProbabilityTolerance

You can access these properties by using dot notation. For example, access the `NegativeLogLikelihood` property, which represents the negative loglikelihood of the data `X` given the fitted model.

    gm.NegativeLogLikelihood

    ans = 7.0584e+03
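One way to make sense of this number is to recompute it from the fitted density: the negative loglikelihood is the negated sum of the log density over all observations. The sketch below regenerates the same kind of data and compares the hand-computed value with the stored property. The two are expected to agree closely, although small differences from the fitting tolerances are possible, so treat the comparison as a sanity check rather than an exact identity.

```matlab
rng('default')                                   % for reproducibility
X = [mvnrnd([1 2], [2 0; 0 .5], 1000); mvnrnd([-3 -5], eye(2), 1000)];
gm = fitgmdist(X, 2);

% Negative loglikelihood computed by hand from the fitted density
nll = -sum(log(pdf(gm, X)));

% Relative difference between the hand-computed and stored values
relDiff = abs(nll - gm.NegativeLogLikelihood) / abs(gm.NegativeLogLikelihood);
fprintf('Stored: %.4f, recomputed: %.4f\n', gm.NegativeLogLikelihood, nll);
fprintf('Relative difference: %.2e\n', relDiff);
```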

After you create a `gmdistribution` object, you can use the object functions. Use `cdf` and `pdf` to compute the values of the cumulative distribution function (cdf) and the probability density function (pdf). Use `random` to generate random variates. Use `cluster`, `mahal`, and `posterior` for cluster analysis.

Plot `X` by using `scatter`. Visualize the fitted model `gm` by using `pdf` and `ezcontour`.

    scatter(X(:,1),X(:,2),10,'.') % Scatter plot with points of size 10
    hold on
    ezcontour(@(x,y)pdf(gm,[x y]),[-8 6],[-8 6])

