When using the "ranksum" function, how can I differentiate between the two ways of getting a low p-value?

Say I have two vectors, A and B. If I get a low p-value, I would like to check whether it reflects a higher median of B compared to A, rather than just different medians (that is, either the median of B is higher than the median of A, or vice versa).
For example:
A=[120 10 201 20 30 12 30 10 2 2 3 5 1]
B=[140 400 120 2000 30 40 2000 1000 1000]
I get a p-value of 7.2251e-004.
But if
B=[1 0 0 0 0 0 0 0 0 0 0 0 0]
I also get a low p-value (6.4360e-006).
I would like to get low p-values only when the median of B is higher than the median of A. Since I have many calculations, I need this to happen automatically in the code, instead of checking every pair of vectors by hand. Do you have any idea how to do that?
Thanks, Michal

Accepted Answer

Both ‘one-sided’ (that one median is greater than or less than the other) and ‘two-sided’ (that the medians are different) options are possible. See Item #4 under Assumptions and formal statement of hypotheses in the Wikipedia article on the Mann–Whitney U. There is also an excellent discussion of this on page 3 of The Wilcoxon Rank-Sum Test.
According to the documentation, ranksum returns the two-sided p-value, so you need to derive the one-sided p-value from it yourself (typically by halving it when the observed difference lies in the hypothesized direction).
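One way to make that calculation is sketched below, using the vectors from the question. It assumes that the direction of the effect can be read from the sample medians and that halving the two-sided p-value is an acceptable conversion; both are simplifications, not something the ranksum documentation prescribes. The variable names are illustrative only.

```matlab
% Data from the question
A = [120 10 201 20 30 12 30 10 2 2 3 5 1];
B = [140 400 120 2000 30 40 2000 1000 1000];

% ranksum returns the two-sided p-value by default
pTwoSided = ranksum(A, B);

% Assumption: the sample medians indicate the direction of the effect.
% H1 of interest: the median of B is higher than the median of A.
if median(B) > median(A)
    pOneSided = pTwoSided / 2;       % effect in the hypothesized direction
else
    pOneSided = 1 - pTwoSided / 2;   % effect in the opposite direction (or none)
end
```

With this conversion, a vector B whose median is lower than that of A yields a one-sided p-value close to 1 rather than a low one.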

4 Comments

The Wikipedia article didn't exactly help me, but the second reference was very helpful.
If I understood correctly, when I use the command [p,h,stats] = ranksum(A,B), the "ranksum" value that I get (the second field of the stats structure) can help me in the following way:
High ranksum value = H1 : A > B
Low ranksum value = H1 : A < B
Is that correct? If it is correct, then what should be the threshold for differentiating between the two options?
Many thanks!
My pleasure!
When I experimented with this a bit, I discovered that the z-statistic (z-value), the first ‘stats’ field, may be the answer. When A > B, the z-statistic is positive, and when A < B, the z-statistic is negative.
According to the documentation, ‘ranksum’ only computes the z-value for large samples, so if your samples aren't large enough, I suggest simply comparing the medians.
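A minimal sketch of that check, assuming the sign convention described above (negative z when A tends to be lower than B) and an illustrative significance level of 0.05. The isfield test covers the case where ranksum does not return a z-value, in which case the medians are compared instead.

```matlab
% Data from the question
A = [120 10 201 20 30 12 30 10 2 2 3 5 1];
B = [140 400 120 2000 30 40 2000 1000 1000];

[p, h, stats] = ranksum(A, B);

if isfield(stats, 'zval')
    % z-value available: negative z is taken to mean A < B
    bIsHigher = stats.zval < 0;
else
    % no z-value returned: fall back to comparing the medians
    bIsHigher = median(B) > median(A);
end

% alpha = 0.05 is an assumed threshold, not part of the original thread
alpha = 0.05;
significantAndHigher = (p < alpha) && bIsHigher;
```

Here significantAndHigher is true only when the two-sided test rejects and the direction favors B, which is the automatic filter asked for in the question.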
That's great!
Again, many thanks!!
For people still using MATLAB R2012a and older, this solution won't work. If vector y is shorter than vector x, ranksum computes the statistic from y rather than x, and your z-score will be flipped. You'll need to correct for the relative lengths of x and y as well as the z-score, or just use ranksum.m from MATLAB R2012b or later.


More Answers (1)

Use the 'tail' option. A careful read of
>> help ranksum
will explain how.
(In the first draft of my answer, I pointed to "doc ranksum" rather than "help ranksum", but it seems that that documentation doesn't list the 'tail' option. Weird.)
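As a rough sketch of what that looks like, assuming a release in which ranksum accepts the 'tail' name-value pair: with 'tail' set to 'right', the alternative hypothesis is that the median of the first argument is greater than that of the second, so B is passed first here.

```matlab
% Data from the question
A = [120 10 201 20 30 12 30 10 2 2 3 5 1];
B = [140 400 120 2000 30 40 2000 1000 1000];

% One-sided test, H1: median of B is greater than median of A.
% Requires a version of ranksum that supports the 'tail' option.
pRight = ranksum(B, A, 'tail', 'right');

% The second B from the question has a lower median than A,
% so the same one-sided test should not give a low p-value.
B2 = [1 0 0 0 0 0 0 0 0 0 0 0 0];
pRight2 = ranksum(B2, A, 'tail', 'right');
```

This avoids any manual conversion of the two-sided p-value and does not depend on inspecting the sign of the z-statistic.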

4 Comments

I can't find the "tail" option in "help ranksum".
But thanks for trying to help!
Ah. I am using the prerelease of R2012b. It seems that the 'tail' option is new.
I should probably get this release too. Thanks anyway!
