Correlation of two variables over time: can this happen?
I have two variables, x1 and x2, observed from January 1 to December 31. When I calculate their correlation over January to June, it is positive. When I calculate it over July to December, it is also positive. But the correlation over the whole year, January to December, is negative. Can this happen?
the cyclist on 17 Dec 2020
Yes. This is known as Simpson's Paradox. Here is an example:
x1 = [1 2 3 4 6 7 8 9]';
x2 = [1 2 3 4 -6 -5 -4 -3]' + 0.8*rand(8,1);
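% Correlation of first half (the off-diagonal entry of the 2x2 matrix is r)
R1 = corrcoef(x1(1:4), x2(1:4))
% Correlation of second half
R2 = corrcoef(x1(5:8), x2(5:8))
% Correlation of entire vector
RAll = corrcoef(x1, x2)
% Plot it
figure
plot(x1, x2, 'o')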
You can see that x1 and x2 are positively correlated within the first half and within the second half, but over the entire vector the trend is negative, because the x2 values in the second half sit far below those in the first half.
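One way to see why the signs can disagree is to split the pooled cross-product sum (the numerator of the correlation coefficient) into a within-half part and a between-half part. The sketch below is only an illustration: it writes x for x1 and y for x2, uses g to index the two halves with n_g = 4 points each, and drops the random term 0.8*rand(8,1) so the numbers come out round. The symbols are introduced here and are not part of the original answer.

```latex
\begin{align*}
\sum_{i}(x_i-\bar x)(y_i-\bar y)
  &= \underbrace{\sum_{g}\sum_{i\in g}(x_i-\bar x_g)(y_i-\bar y_g)}_{\text{within halves}}
   + \underbrace{\sum_{g} n_g(\bar x_g-\bar x)(\bar y_g-\bar y)}_{\text{between halves}} \\[4pt]
\text{within halves}  &= 5 + 5 = 10 \\
\text{between halves} &= 4(2.5-5)(2.5+1) + 4(7.5-5)(-4.5+1) = -35 - 35 = -70 \\
\text{total}          &= 10 - 70 = -60
\end{align*}
```

The within-half terms are both positive, which matches the positive correlation in each half. The between-half term is negative and much larger in magnitude, because the half with the larger mean of x1 has the smaller mean of x2, so the pooled sum and therefore the overall correlation come out negative.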