Expanding Sample Covariance Matrix

Lemar DeSalis on 21 Aug 2011
Hello!
I need to calculate the mean vector and the covariance matrix for sampled data. For example, I have a matrix with NumSamples rows and NumFeatures columns, so I can simply use "mean(MyMatrix)" and "cov(MyMatrix)".
However, what should I do if I want to extend the covariance matrix obtained this way?
That is, I have a covariance matrix calculated from the old samples; how can I add the influence of new samples?
Is there an easy MATLAB way to do that?
Thanks in advance!
  1 comment
Oleg Komarov on 21 Aug 2011
The terminology you're using is not clear. Could you give an example?
For reference: http://www.mathworks.com/matlabcentral/answers/6200-tutorial-how-to-ask-a-question-on-answers-and-get-a-fast-answer


Answers (2)

Lemar DeSalis on 22 Aug 2011
% MyMatrix is a matrix containing samples, in this case random data:
MyMatrix = rand([NumSamples NumFeatures]);
% I need the mean vector and the covariance matrix:
MyMean = mean(MyMatrix);
MyCov = cov(MyMatrix);
% Now I get some new data:
MyLargerMatrix = vertcat(MyMatrix, SomeNewData);
% Calculate the new values:
MyMean_New1 = mean(MyLargerMatrix);
MyCov_New1 = cov(MyLargerMatrix);
%%%% HERE IS MY QUESTION:
% But what to do when the old data is no longer available?
clear MyLargerMatrix MyMatrix
% MyCov_New2 = ... ?
% How can I update the covariance matrix if I only have the old
% covariance matrix "MyCov", the old mean vector "MyMean", the number
% of old samples "NumSamples", and the new samples "SomeNewData"?
%
% MyCov_New2 should be identical to MyCov_New1, but MyCov_New2
% should be computed WITHOUT access to the old data.
% For the mean vector this is easy, but how can it be done for the covariance matrix?
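
The mean update mentioned in the last comment can be sketched as follows. This is a minimal illustration, not code from the thread: OldData, NumNew, and the matrix sizes are hypothetical names and values chosen for the example, and MyMean_New1 is kept only as a reference to check against.

```matlab
% Hypothetical example data (sizes are assumptions, not from the thread)
OldData     = rand(100, 3);              % stands in for the old samples
SomeNewData = rand(20, 3);               % newly arrived samples

NumSamples  = size(OldData, 1);          % number of old samples
MyMean      = mean(OldData, 1);          % old mean (1-by-NumFeatures row vector)

% Reference value computed with access to all data, for comparison only
MyMean_New1 = mean([OldData; SomeNewData], 1);

% From here on, pretend the old samples are gone
clear OldData

% Weighted update: old sum (NumSamples*MyMean) plus the sum of the new rows,
% divided by the total number of samples
NumNew      = size(SomeNewData, 1);
MyMean_New2 = (NumSamples*MyMean + sum(SomeNewData, 1)) / (NumSamples + NumNew);

% Largest absolute difference; should be at round-off level
disp(max(abs(MyMean_New2 - MyMean_New1)))
```

Only the old mean and the old sample count are needed here; the covariance update additionally needs the old covariance matrix.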

Oleg Komarov on 22 Aug 2011
% Example inputs
A = rand(100,2);
B = randn(20,2);
C = [A;B];
% Sample covariances (normalized by N-1)
c1 = cov(A);
c2 = cov(B);
c3 = cov(C);
% Means
m1 = mean(A);
m2 = mean(B);
m3 = mean(C);
% Number of samples
nA = size(A,1);
nB = size(B,1);
nC = nA + nB;
% The question is: how to get c3 having only c1, c2, m1, m2 (and the sample counts nA, nB)?
% Keep in mind that:
  • cov(x,y) = E(xy) - E(x)E(y)
  • m3 = (m1*nA + m2*nB)/nC
  • E(xy) combines the same way, weighted by the sample counts
  • cov is the sample covariance, so we have to adjust for N-1
  • the following formula is valid only for the covariance (the off-diagonal entries), not for the variances on the diagonal
ExEy12 = prod((m1*nA + m2*nB)/nC);
adj = nC/(nC-1);
(c1*(nA-1) + c2*(nB-1) + prod(m1)*nA + prod(m2)*nB)/nC*adj - ExEy12 * adj
c3
How to derive the variance is up to you. But you really just need paper and pencil.
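
For readers who want the whole matrix at once, including the variances on the diagonal, here is a hedged sketch of one standard way to combine two sample covariance matrices. It is not the formula posted above: it replaces the scalar prod(m1) terms with an outer product of the difference of the means. The names d and c3_merged are introduced for this example only, and it assumes means are row vectors and that each block has at least two samples so that cov returns a full matrix.

```matlab
% Example inputs, same shapes as in the answer above
A = rand(100, 2);
B = randn(20, 2);

% Per-block statistics: the only quantities the update needs
nA = size(A, 1);   nB = size(B, 1);   nC = nA + nB;
m1 = mean(A, 1);   m2 = mean(B, 1);   % row vectors
c1 = cov(A);       c2 = cov(B);       % normalized by N-1

% Combined mean
d  = m2 - m1;                          % difference of the block means
m3 = m1 + (nB/nC) * d;

% Combined covariance: pooled scatter matrices plus a correction term for
% the shift between the two block means, then renormalized by nC-1
c3_merged = ((nA-1)*c1 + (nB-1)*c2 + (nA*nB/nC) * (d' * d)) / (nC - 1);

% Check against the direct computation on the concatenated data
c3 = cov([A; B]);
disp(max(abs(c3_merged(:) - c3(:))))   % should be at round-off level
```

In the setting of the question, A plays the role of the old data (only nA, m1, and c1 are kept) and B is SomeNewData, so m2 and c2 can be computed from the new samples alone.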
  1 comment
Lemar DeSalis on 23 Aug 2011
Thanks, I was able to find a solution based on your code!
