I would like to generate 30 data series that fall into three groups by distribution similarity. To explain: if I have x1 and x2 drawn from two distinct distributions, I expect kstest2 to recognise that they do not have the same distribution.
x1 = wblrnd(1,1,50,1);
x2 = gamrnd(1,1,[50 1]);
[h,p,k] = kstest2(x1,x2)
But when I extend it to these lines it shows different results:
x1 = wblrnd(1,1,50,10);
x2 = gamrnd(1,1,[50 10]);
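x3 = [x1,x2];      % columns 1-10: Weibull, columns 11-20: Gamma
n = size(x3,2);
h = zeros(n); p = zeros(n); k = zeros(n);   % preallocate the result matrices
for i = 1:n
    for j = 1:n
        [h(i,j),p(i,j),k(i,j)] = kstest2(x3(:,i),x3(:,j));
    end
end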
In my mind, the Weibull distribution vectors should all give h = 1, and the same should hold for the Gamma distribution vectors. However, the hypothesis test results show something different.
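A side note that may explain part of this: in kstest2, h = 1 means the null hypothesis "both samples come from the same distribution" is rejected. Also, wblrnd(1,1,...) (scale 1, shape 1) and gamrnd(1,1,...) (shape 1, scale 1) both reduce to an exponential distribution with mean 1, so the two groups are in fact drawn from the same distribution and h = 0 is the expected outcome for most pairs. The following is only a sketch of how the within-group and between-group rejection rates could be checked; the names isW, within, rejWithin and rejBetween are mine, and the seed is arbitrary.

```matlab
rng(1);                                   % arbitrary seed, for reproducibility only
x3 = [wblrnd(1,1,50,10), gamrnd(1,1,[50 10])];
nCol = size(x3,2);

h = zeros(nCol);                          % h(i,j) = 1 means "same distribution" was rejected
p = ones(nCol);                           % diagonal stays at p = 1 (a sample versus itself)
for i = 1:nCol
    for j = i+1:nCol                      % the test is symmetric, so fill both halves at once
        [h(i,j), p(i,j)] = kstest2(x3(:,i), x3(:,j));
        h(j,i) = h(i,j);
        p(j,i) = p(i,j);
    end
end

isW     = [true(1,10), false(1,10)];      % columns 1-10 Weibull, 11-20 Gamma
offDiag = ~eye(nCol);
within  = bsxfun(@eq, isW', isW) & offDiag;   % pairs from the same generator
between = ~bsxfun(@eq, isW', isW);            % Weibull-versus-Gamma pairs

rejWithin  = mean(h(within))              % fraction of rejections within a group
rejBetween = mean(h(between))             % fraction of rejections between groups
```

If both fractions come out small and similar, that would be consistent with the two generators producing the same distribution for these parameter values.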
Now, the point is: how can I generate 10 series with low (around 0.2), 10 series with medium (around 0.5) and 10 series with high (around 0.8) KS statistics? In the end, I want to plot the CDFs of these 30 series together to see how the data group according to distribution similarity. Naturally, when the Kolmogorov-Smirnov statistic is near zero, the distributions of the two data sets are similar, and they may belong to the same cluster.
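To make the goal concrete, here is a rough sketch of one possible construction, not necessarily the best one. It assumes the KS statistic is measured against a single reference sample, and it swaps in normal distributions because the distance is then easy to control: for N(0,1) versus N(d,1) the largest gap between the two CDFs is 2*normcdf(d/2) - 1. The names ref, targetD, shift, grp and meanKS are made up for illustration. With only 50 points per series the sample statistic will scatter around the target, and it tends to come out somewhat above it for the low group.

```matlab
rng(0);                              % arbitrary seed, for reproducibility only
n       = 50;                        % observations per series
nPer    = 10;                        % series per group
targetD = [0.2 0.5 0.8];             % desired population KS distances to the reference

% Invert D = 2*normcdf(d/2) - 1 to get the mean shift d for each target distance.
shift = 2*norminv((1 + targetD)/2);

ref = randn(n,1);                    % reference sample from N(0,1)
X   = zeros(n, nPer*numel(targetD)); % 30 series, one per column
grp = zeros(1, size(X,2));           % group label (1 = low, 2 = medium, 3 = high)
ks  = zeros(1, size(X,2));           % sample KS statistic versus the reference
for g = 1:numel(targetD)
    for s = 1:nPer
        c      = (g-1)*nPer + s;
        X(:,c) = randn(n,1) + shift(g);
        grp(c) = g;
        [~,~,ks(c)] = kstest2(ref, X(:,c));
    end
end

meanKS = accumarray(grp(:), ks(:), [], @mean)'   % average statistic per group

% Empirical CDFs of all 30 series, coloured by group.
figure; hold on
colors = lines(numel(targetD));
for c = 1:size(X,2)
    [f,x] = ecdf(X(:,c));
    stairs(x, f, 'Color', colors(grp(c),:));
end
hold off
xlabel('x'); ylabel('Empirical CDF');
```

Note that this only fixes each series' distance to the reference; the pairwise statistics between two series of the same group would be small, which is what I would expect a clustering to pick up.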