MATLAB Answers


Indicating statistical significance on a boxplot in MATLAB

Asked by Hari krishnan on 28 Oct 2018
Latest activity Commented on by jonas
on 28 Oct 2018
I am plotting two boxplots of my sample data sets in MATLAB. I want to put a star between the boxplots to indicate statistical significance. When I draw the star, it ends up in one corner rather than between the boxes. I am attaching the boxplot. Any help solving this would be appreciated.
x1 = required_data_threhold_time_for_recruitment_gdnest;
x2 = required_data_threhold_time_for_recruitment_bdnest;
x = [x1 ;x2];
g = [ones(size(x1)); 2*ones(size(x2))];
boxplot(x,g,'Labels',{'Good nest (1 lux)','Poor nest (16 lux)'});
yt = get(gca, 'YTick');
axis([xlim 0 ceil(max(yt)*1.2)])
set(gca, 'Xtick', 1:3);
xt = get(gca, 'XTick');
hold on
plot(xt([2 3]), [1 1]*max(yt)*1.1, '-k', mean(xt([2 3])), max(yt)*1.15, '*k')
hold off



1 Answer

Answer by jonas
on 28 Oct 2018
Edited by jonas
on 28 Oct 2018
 Accepted Answer

It is exactly where you plotted it. Why would you expect a different result?
plot(mean(xt([2 3])), max(yt)*1.15, '*k')
That puts the star at x = 2.5 (since xt = [1 2 3]) and slightly above the largest value in yt, which I can only assume is 5000 (the largest y-tick was probably 5000 when you created yt).
You are probably confused because you use the x-ticks to decide where to plot. What makes it more confusing is that your x-ticks extend beyond the axis limits: the boxes sit at x = 1 and x = 2, but you set ticks at 1:3. Basically, you have a very simple problem that you are trying to solve in a very complicated way by involving the x- and y-ticks.
If you want it between the boxes, then I can only assume that you should change to:
plot(mean(xt([1 2])), max(yt)*1.15, '*k')
There are, however, a lot of assumptions going into this answer, so you may want to clarify.
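For reference, here is a minimal sketch of the whole annotation done without ticks at all. It relies on boxplot placing the groups at x = 1 and x = 2 by default. The data are synthetic stand-ins (the real variables are not shown in the question), and the names boxPos, yTop, yLine and yStar are made up for illustration. Note that the horizontal line needs the same change as the star, since the original call drew it from x = 2 to x = 3.

```matlab
% Synthetic stand-ins for the two samples (assumption: the real data are column vectors)
rng(1);                               % fixed seed so the figure is reproducible
x1 = 1000 + 300*randn(30, 1);         % hypothetical "good nest" times
x2 = 2500 + 600*randn(30, 1);         % hypothetical "poor nest" times

x = [x1; x2];
g = [ones(size(x1)); 2*ones(size(x2))];
boxplot(x, g, 'Labels', {'Good nest (1 lux)', 'Poor nest (16 lux)'});

% boxplot draws the groups at x = 1 and x = 2, so use those positions directly
boxPos = [1 2];
yTop   = max(x);                      % highest data point, outliers included
yl     = ylim;                        % current y-limits chosen by boxplot
yRange = yl(2) - yl(1);
yLine  = yTop + 0.05*yRange;          % height of the bracket line
yStar  = yTop + 0.10*yRange;          % height of the star, just above the line
ylim([yl(1), yTop + 0.15*yRange]);    % make room above the boxes

hold on
plot(boxPos, [yLine yLine], '-k');    % line spanning both boxes
plot(mean(boxPos), yStar, '*k');      % star centred between the boxes (x = 1.5)
hold off
```

Working from the data maximum rather than from the y-ticks means the annotation stays in place even if the tick spacing changes when the figure is resized.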


It worked. I was confused by the use of the x-ticks. Can I ask a follow-up question? If I need the p-value for the level of significance, how should I proceed?
There are many ways to calculate a p-value, and the most appropriate method depends on the nature of your data. To be honest, I'm probably not the right person to ask. The few times I have calculated p-values I have used the Monte Carlo method, mainly because it is intuitively easy to understand and makes few assumptions.
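One common Monte Carlo approach is a permutation test on the difference in group means. The sketch below is only an illustration of that idea, not necessarily the exact procedure meant above. The data are synthetic, and nPerm, observed, pooled and count are names chosen for this example. It uses base MATLAB functions only.

```matlab
% Synthetic stand-ins for the two samples (assumption: column vectors)
rng(1);                               % fixed seed for reproducibility
x1 = 1000 + 300*randn(30, 1);
x2 = 2500 + 600*randn(30, 1);

nPerm    = 10000;                         % number of random relabelings
observed = abs(mean(x1) - mean(x2));      % observed test statistic
pooled   = [x1; x2];                      % pool both groups
n1       = numel(x1);
nTotal   = numel(pooled);

count = 0;
for k = 1:nPerm
    idx = randperm(nTotal);               % shuffle the group labels
    d   = abs(mean(pooled(idx(1:n1))) - mean(pooled(idx(n1+1:end))));
    if d >= observed                      % as extreme as the real difference?
        count = count + 1;
    end
end

% Two-sided p-value; the +1 terms keep the estimate away from exactly zero
p = (count + 1) / (nPerm + 1);
fprintf('Permutation-test p-value: %.4g\n', p);
```

If the Statistics and Machine Learning Toolbox is available (boxplot already requires it), ttest2 and ranksum are ready-made alternatives. Which test is appropriate still depends on the data.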
