Statistical difference between a number of matrices.

Hey,
I am trying to find a statistical approach to comparing the differences between matrices in a 3D matrix. I have a 3D matrix of size m-by-n-by-z, and I would like to have a statistical measure (ideally an m-by-n matrix) that shows where the matrices differ most, or where they are statistically different from their mean value.
I have come across a few tests in the Statistics Toolbox that do this between two matrices, or even between two 3D matrices, but I am not sure how to apply them to a single 3D matrix. Perhaps a t-test (normalized by the mean) along the z axis that produces an m-by-n matrix of p-values would work here.
Hope my question makes sense.

Accepted Answer

If I understand correctly what you want to do, I would reshape it so that each column of the reshaped matrix holds the entries along the third dimension for one (m,n) position of your original matrix. Then choose a test (I’m using anova1 here, which treats each column as a group), then use the multcompare function.
Example:
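M = randi(9, 3, 4, 3)                  % Create example data (3-by-4-by-3)
Mr = reshape(M, [], 3)'                % Reshape to 3-by-12: one column per (m,n) position, one row per slice
[p, t, stats] = anova1(Mr);            % One-way ANOVA across columns; returns the ‘stats’ structure
[c, m, h, nms] = multcompare(stats);   % Multiple comparisons between the column (position) means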
There are likely a number of different approaches you can take. This is the one I consider most appropriate.
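For the m-by-n map of p-values mentioned in the question, one alternative is a one-sample t-test along the third dimension. This is only a minimal sketch: it assumes the Statistics and Machine Learning Toolbox is available, and the choice of the grand mean as the reference value (`ref`) is an assumption made for illustration, as is the synthetic data.

```matlab
rng(0);                               % reproducible example data
M = randn(3, 4, 50);                  % hypothetical m-by-n-by-z data
M(2, 3, :) = M(2, 3, :) + 2;          % shift one position so it stands out
ref = mean(M(:));                     % reference value (assumed: grand mean)
[h, p] = ttest(M, ref, 'Dim', 3);     % one test per (m,n) position, along z
[pMin, idx] = min(p(:));              % most significant position
[r, c] = ind2sub(size(p), idx);
fprintf('Smallest p-value %.3g at row %d, column %d\n', pMin, r, c);
```

Here `p` is an m-by-n matrix, so it can be inspected directly or displayed with imagesc(p). The p-values are not corrected for multiple comparisons.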

6 Comments

mashtine on 1 Jun 2016
Edited: mashtine on 1 Jun 2016
Hi,
Thanks for the response! Ideally I would like to know where in the matrix the largest difference is, or obtain some measure of error between all the matrices (perhaps using the MS output). At times I have thousands of matrices to compare, so most of the p-values end up being '0' because the differences are highly significant.
My pleasure.
The multcompare function can probably tell you that information. You will probably have to experiment with it to see. I’m not sure that I understand what you’re doing, but just using z-statistics might be the way to go to find the ones that are of least or greatest difference. Those would not be statistically valid because of the multiple comparison problem, but would provide you a common standard to sort or test to find the ones that meet your criteria.
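The z-statistic idea above could be sketched as follows. This uses base MATLAB only; the synthetic data and the choice to standardize each position's mean against the mean and standard deviation of all position means are assumptions for illustration. As noted, the result is a ranking device, not a valid significance test.

```matlab
rng(1);                                   % reproducible example data
M = randn(3, 4, 200);                     % hypothetical m-by-n-by-z data
M(1, 2, :) = M(1, 2, :) + 3;              % make one position differ from the rest
mu = mean(M, 3);                          % per-position mean along z (m-by-n)
z  = (mu - mean(mu(:))) ./ std(mu(:));    % z-score of each position's mean
[~, order] = sort(abs(z(:)), 'descend');  % rank positions by |z|
[r, c] = ind2sub(size(z), order(1));      % position that differs most
fprintf('Largest |z| = %.2f at row %d, column %d\n', abs(z(r, c)), r, c);
```

The vector `order` lists all positions from most to least different, so the last entries give the positions of least difference.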
Again, I don’t know what your experiment design or reference mean is, so this is just a guess. I’ve never done anything similar to what you’re doing, so I’m also assuming a lot about your data and design that may not be valid.
Thanks again!
Basically, I have a coarse-grid mean wind speed bin (say 0-5 m/s), and I have the corresponding fine-grid wind speeds for that bin, which make up my 3D matrix. I am trying to see how much the fine-grid values (each one being from a different time) differ within a particular coarse wind speed bin. For instance, do the fine-grid patterns become more dissimilar at higher wind speeds?
Does this make a bit more sense? If the coarse-grid bin mean wind speed is 0-5, I find all the matching coarse-grid winds within that bin and their corresponding fine-grid patterns. It is the difference between these fine-grid patterns that I want to assess statistically.
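One simple way to put a number on how much the fine-grid patterns within a bin differ is the standard deviation along the third (time) dimension. The sketch below is only an illustration: both data sets are synthetic, and the second bin (5-10 m/s) and the variable names are hypothetical.

```matlab
rng(2);                                   % reproducible example data
lowBin  = 2.5 + 0.5 * randn(3, 4, 100);   % hypothetical fine-grid fields, 0-5 m/s bin
highBin = 7.5 + 1.5 * randn(3, 4, 100);   % hypothetical fine-grid fields, 5-10 m/s bin
sdLow  = std(lowBin,  0, 3);              % m-by-n map of spread across time
sdHigh = std(highBin, 0, 3);
[~, idx] = max(sdHigh(:));                % position with the largest spread
[r, c] = ind2sub(size(sdHigh), idx);
fprintf('Mean spread: low bin %.2f, high bin %.2f\n', ...
        mean(sdLow(:)), mean(sdHigh(:)));
```

Comparing the mean of each map across bins gives one value per wind speed bin, and the maps themselves show where in the grid the patterns vary most.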
That would probably require some pre-processing. The value you may want to test could be the slope of a linear regression as a function of wind speed. Parameters such as slopes have their own uncertainty measures, so those may be what you want to compare. To decide whether a slope is significantly different from zero, you could use its confidence interval: if the interval includes zero, the slope is not significant.
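The confidence-interval check on a slope might look like this. It is a sketch under stated assumptions: it uses regress from the Statistics and Machine Learning Toolbox, and the wind speeds and the dissimilarity measure (`spread`) are synthetic stand-ins, not data from this thread.

```matlab
rng(3);                                             % reproducible example data
speed  = linspace(1, 20, 40)';                      % hypothetical coarse-grid wind speeds (m/s)
spread = 0.2 + 0.05 * speed + 0.1 * randn(40, 1);   % hypothetical dissimilarity measure
X = [ones(size(speed)), speed];                     % design matrix: intercept and slope
[b, bint] = regress(spread, X);                     % bint holds 95% confidence intervals
slopeCI = bint(2, :);                               % second row corresponds to the slope
isSignificant = slopeCI(1) > 0 || slopeCI(2) < 0;   % true when the interval excludes zero
fprintf('Slope %.3f, 95%% CI [%.3f, %.3f]\n', b(2), slopeCI(1), slopeCI(2));
```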
And with that, we have exhausted my knowledge of the sort of statistical approach you need. I have no idea what statistical test you would use to test the confidence intervals of a group of slopes that would also tell you what specific slopes were most different.
Not a problem! Definitely an accepted answer nonetheless, and thank you for all your help and input.
Star Strider on 1 Jun 2016
Edited: Star Strider on 3 Jun 2016
As always, my pleasure!
EDIT (2016 06 03 at 18:00 UTC)
I thought about this a bit more, and came up with the idea of taking the confidence limits of the slope for each regression and multiplying them together to create a new sort of variable. The non-significant ones (whose intervals include zero) will be negative, and the highly significant ones will likely be large and positive, with the others in between. I’ve not tested this idea, but it, or something like it, might be worth considering.
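A quick way to try the product-of-confidence-limits idea is sketched below. It is untested against real data: the three data sets and their slopes are synthetic, the variable names are hypothetical, and regress requires the Statistics and Machine Learning Toolbox.

```matlab
rng(4);                                       % reproducible example data
x = (1:30)';                                  % hypothetical predictor (e.g. wind speed)
trueSlopes = [0, 0.5, 2];                     % hypothetical slopes; the first is zero
score = zeros(size(trueSlopes));
for k = 1:numel(trueSlopes)
    y = trueSlopes(k) * x + 3 * randn(size(x));   % synthetic response with noise
    [~, bint] = regress(y, [ones(size(x)), x]);   % 95% confidence intervals
    score(k) = prod(bint(2, :));              % product of the slope's lower and upper limits
end
disp(score);
```

The sign of `score` carries the significance (the product is negative exactly when the interval spans zero), and for significant slopes the magnitude grows as the interval moves away from zero, so sorting `score` orders the regressions.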
