Creating Discriminant Analysis Model

The model for discriminant analysis is:

  • Each class (Y) generates data (X) using a multivariate normal distribution. In other words, the model assumes X has a Gaussian mixture distribution (gmdistribution).

    • For linear discriminant analysis, the model has the same covariance matrix for each class; only the means vary.

    • For quadratic discriminant analysis, both means and covariances of each class vary.

Under this modeling assumption, fitcdiscr infers the mean and covariance parameters of each class.

  • For linear discriminant analysis, it computes the sample mean of each class. It then subtracts each class's sample mean from the observations of that class and takes the empirical covariance matrix of the combined, centered data. The result is a single covariance estimate shared by all classes.

  • For quadratic discriminant analysis, it computes the sample mean of each class. It then subtracts each class's sample mean from the observations of that class and takes the empirical covariance matrix of each class separately, giving one covariance estimate per class.

The fit method does not use prior probabilities or costs for fitting.

Weighted Observations

fitcdiscr constructs weighted classifiers using the following scheme. Suppose M is an N-by-K class membership matrix:

$$M_{nk} = \begin{cases} 1 & \text{if observation } n \text{ is from class } k, \\ 0 & \text{otherwise.} \end{cases}$$

The estimate of the class mean for unweighted data is

$$\hat{\mu}_k = \frac{\sum_{n=1}^{N} M_{nk} x_n}{\sum_{n=1}^{N} M_{nk}}.$$

For weighted data with positive weights $w_n$, the natural generalization is

$$\hat{\mu}_k = \frac{\sum_{n=1}^{N} M_{nk} w_n x_n}{\sum_{n=1}^{N} M_{nk} w_n}.$$
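
The two class-mean formulas can be illustrated with a small sketch in plain Python. This is not how fitcdiscr is implemented; the function name `class_means` and the toy data are hypothetical. The membership matrix M is not built explicitly: each observation contributes only to the accumulator of its own class, which is the same as multiplying by $M_{nk}$.

```python
def class_means(X, labels, weights=None):
    """Return {class: mean vector} as a weighted average of the rows in each class.

    X is a list of equal-length feature lists, labels gives the class of each
    row, and weights (optional) holds positive observation weights w_n.
    """
    if weights is None:
        weights = [1.0] * len(X)  # unweighted case: every w_n = 1
    dim = len(X[0])
    sums = {}    # class -> running sum of w_n * x_n
    totals = {}  # class -> running sum of w_n
    for x, k, w in zip(X, labels, weights):
        acc = sums.setdefault(k, [0.0] * dim)
        for j in range(dim):
            acc[j] += w * x[j]
        totals[k] = totals.get(k, 0.0) + w
    return {k: [s / totals[k] for s in sums[k]] for k in sums}


# Toy data (hypothetical): two observations in class "a", one in class "b".
X = [[1.0, 2.0], [3.0, 4.0], [10.0, 0.0]]
labels = ["a", "a", "b"]

unweighted = class_means(X, labels)
weighted = class_means(X, labels, weights=[3.0, 1.0, 2.0])
print(unweighted["a"])  # [2.0, 3.0]
print(weighted["a"])    # [1.5, 2.5]
```

Giving the first observation three times the weight of the second pulls the class "a" mean toward it, while a class with a single observation keeps the same mean whatever its weight.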

The unbiased estimate of the pooled-in covariance matrix for unweighted data is

$$\hat{\Sigma} = \frac{\sum_{n=1}^{N} \sum_{k=1}^{K} M_{nk} (x_n - \hat{\mu}_k)(x_n - \hat{\mu}_k)^T}{N - K}.$$

For quadratic discriminant analysis, fitcdiscr uses K = 1.
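
As a rough illustration of the unweighted estimate, the sketch below computes the pooled covariance with divisor N - K, along with a per-class variant. All function names and the toy data are hypothetical. The per-class variant reads "K = 1" as applying the same formula to each class on its own, so the divisor becomes the class size minus 1; that reading is an assumption, not something the text above spells out.

```python
def group_by_class(X, labels):
    """Collect the rows of X under their class label."""
    groups = {}
    for x, k in zip(X, labels):
        groups.setdefault(k, []).append(x)
    return groups


def mean_of(rows):
    """Column-wise sample mean of a list of rows."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def scatter(rows, mean):
    """Sum of outer products (x - mean)(x - mean)^T over the given rows."""
    dim = len(mean)
    S = [[0.0] * dim for _ in range(dim)]
    for x in rows:
        d = [x[j] - mean[j] for j in range(dim)]
        for i in range(dim):
            for j in range(dim):
                S[i][j] += d[i] * d[j]
    return S


def pooled_covariance(X, labels):
    """One covariance shared by all classes, with divisor N - K."""
    groups = group_by_class(X, labels)
    dim = len(X[0])
    total = [[0.0] * dim for _ in range(dim)]
    for rows in groups.values():
        S = scatter(rows, mean_of(rows))
        for i in range(dim):
            for j in range(dim):
                total[i][j] += S[i][j]
    denom = len(X) - len(groups)  # N - K
    return [[v / denom for v in row] for row in total]


def per_class_covariances(X, labels):
    """Assumed reading of K = 1: each class alone, with divisor N_k - 1."""
    out = {}
    for k, rows in group_by_class(X, labels).items():
        S = scatter(rows, mean_of(rows))
        out[k] = [[v / (len(rows) - 1) for v in row] for row in S]
    return out


# Toy data (hypothetical): N = 5 observations, K = 2 classes.
X = [[0.0, 0.0], [2.0, 2.0], [5.0, 1.0], [7.0, -1.0], [9.0, 3.0]]
labels = ["a", "a", "b", "b", "b"]

pooled = pooled_covariance(X, labels)
per_class = per_class_covariances(X, labels)
print(per_class["b"])  # [[4.0, 2.0], [2.0, 4.0]]
```

Here the class scatter matrices sum to [[10, 6], [6, 10]], so the pooled estimate divides by N - K = 3, while each per-class estimate divides only its own scatter by its own class size minus 1.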

For weighted data, assuming the weights sum to 1, the unbiased estimate of the pooled-in covariance matrix is

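$$\hat{\Sigma} = \frac{\sum_{n=1}^{N} \sum_{k=1}^{K} M_{nk} w_n (x_n - \hat{\mu}_k)(x_n - \hat{\mu}_k)^T}{1 - \sum_{k=1}^{K} \frac{W_k^{(2)}}{W_k}},$$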

where

  • $W_k = \sum_{n=1}^{N} M_{nk} w_n$ is the sum of the weights for class k.

  • $W_k^{(2)} = \sum_{n=1}^{N} M_{nk} w_n^2$ is the sum of squared weights per class.
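
A minimal sketch of the weighted estimate in plain Python follows. The function name `weighted_pooled_covariance` and the toy data are hypothetical, and the code makes no claim about fitcdiscr's internals. For convenience it rescales the weights to sum to 1 before applying the formula, which is an added assumption. One sanity check: with equal weights $w_n = 1/N$, each ratio $W_k^{(2)}/W_k$ equals $1/N$, so the denominator becomes $(N - K)/N$ and the estimate reduces to the unweighted one.

```python
def weighted_pooled_covariance(X, labels, weights):
    """Weighted pooled covariance with denominator 1 - sum_k W2_k / W_k."""
    total_w = sum(weights)
    w = [wi / total_w for wi in weights]  # rescale so the weights sum to 1
    dim = len(X[0])

    W = {}     # class -> sum of weights, W_k
    W2 = {}    # class -> sum of squared weights, W_k^(2)
    wsum = {}  # class -> weighted sum of observations
    for x, k, wn in zip(X, labels, w):
        W[k] = W.get(k, 0.0) + wn
        W2[k] = W2.get(k, 0.0) + wn * wn
        acc = wsum.setdefault(k, [0.0] * dim)
        for j in range(dim):
            acc[j] += wn * x[j]

    # Weighted class means.
    mu = {k: [s / W[k] for s in wsum[k]] for k in W}

    # Weighted scatter around each observation's own class mean.
    S = [[0.0] * dim for _ in range(dim)]
    for x, k, wn in zip(X, labels, w):
        d = [x[j] - mu[k][j] for j in range(dim)]
        for i in range(dim):
            for j in range(dim):
                S[i][j] += wn * d[i] * d[j]

    denom = 1.0 - sum(W2[k] / W[k] for k in W)
    return [[v / denom for v in row] for row in S]


# Toy data (hypothetical): N = 5 observations, K = 2 classes.
X = [[0.0, 0.0], [2.0, 2.0], [5.0, 1.0], [7.0, -1.0], [9.0, 3.0]]
labels = ["a", "a", "b", "b", "b"]

# Equal weights should reproduce the unweighted estimate with divisor N - K = 3.
sigma = weighted_pooled_covariance(X, labels, [1.0] * 5)
print(sigma)
```

For this data the class scatter matrices sum to [[10, 6], [6, 10]], so the equal-weight result should be close to [[10/3, 2], [2, 10/3]] up to floating-point rounding.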
