Hi,
I have been testing the use of kstest for the detection of a discrete uniform distribution. However, I believe that I am encountering an error, or using the function incorrectly. For example, if I define a variable array
x = randi([1 4],1900,1);
where tabulate gives the following very uniform distribution
tabulate(x)
Value    Count    Percent
    1      449     23.63%
    2      482     25.37%
    3      482     25.37%
    4      487     25.63%
and I then run the test
kstest(x, 'CDF', [x unidcdf(x,4)])
I get a result of h = 1, i.e. rejection of the hypothesis that x is discrete uniform, which looks wrong given how even the counts are (at least in my eyes). Could someone with more experience with this test explain why I'm getting this result, and whether I'm doing something wrong?
Many thanks.

Accepted Answer

Jeff Miller on 12 Oct 2021

1 vote

You can test for the fit of a discrete distribution, including the discrete uniform, with chi2gof. One of the examples (about 1/3 of the way down the documentation page) shows how to test for a Poisson distribution. For the discrete uniform, the expected counts (expCounts) are simply the total number of observations divided by the number of possible discrete values.

3 Comments

John Smith on 13 Oct 2021
Edited: John Smith on 14 Oct 2021
Thank you Jeff, that seems to do the trick well. For others that might have this problem in the future, here is the code that worked for me:
%Get a discrete uniform distribution
x = randi([1 4],1900,1);
%Create expected counts variable, which is repeated 4 times as that is the
%number of discrete values
expCounts = repmat(numel(x)/4,[4,1]);
%run the chi2gof test
[h,p,stats]=chi2gof(x,'Ctrs',[1 2 3 4],'Expected',expCounts)
%These are the results I got
h =
0
p =
0.5865
stats =
struct with fields:
chi2stat: 1.9326
df: 3
edges: [0.5000 1.5000 2.5000 3.5000 4.5000]
O: [449 482 482 487]
E: [475 475 475 475]
As h = 0 the null hypothesis, which is that x is discrete uniform, is not rejected. I will be looking further into why the chi2 metric is suitable and the KS one is not, as this has been quite helpful!
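As a sanity check, the statistic and p-value above can be recomputed by hand from the observed and expected counts. This is only a minimal sketch: it assumes the Pearson chi-square statistic with (number of categories - 1) degrees of freedom, which matches the df = 3 reported above, and it uses base MATLAB's gammainc for the upper tail so no toolbox call is needed. The variable names are just for illustration.

```matlab
% Observed counts reported above; expected counts under a discrete uniform on 1..4
O = [449 482 482 487];
E = repmat(sum(O)/numel(O), 1, numel(O));   % 475 per category

% Pearson chi-square statistic: sum of squared deviations scaled by expectation
chi2stat = sum((O - E).^2 ./ E);

% Degrees of freedom: categories minus 1 (assumes no parameters were estimated)
df = numel(O) - 1;

% Upper-tail chi-square probability via the regularized incomplete gamma function
p = gammainc(chi2stat/2, df/2, 'upper');

fprintf('chi2stat = %.4f, df = %d, p = %.4f\n', chi2stat, df, p);
```

If the assumptions hold, this should agree with the chi2stat and p values that chi2gof printed above.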
Jeff Miller on 14 Oct 2021
Just a minor correction of the terminology at the end: the null hypothesis is that X is a discrete uniform, and h=0 means that this null hypothesis should not be rejected based on the observed X values.
John Smith on 14 Oct 2021
Thank you, I appreciate the correction, and have edited the reply to avoid anyone reading this making the same mistake.


More Answers (1)

the cyclist on 12 Oct 2021

0 votes

From the documentation: "The one-sample Kolmogorov-Smirnov test is only valid for continuous cumulative distribution functions." (Emphasis added.)
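To see why a discrete null distribution trips the test, here is a rough sketch that computes the textbook one-sample KS statistic by hand from the counts in the question. It is not claimed to reproduce kstest's internals exactly; it assumes the usual definition (largest gap between the empirical and hypothesized CDFs, checked on both sides of each jump) and uses the large-sample 5% critical value of about 1.358/sqrt(n), which is derived for a continuous null.

```matlab
% Counts per value taken from the tabulate output in the question
O = [449 482 482 487];
n = sum(O);                        % 1900 observations
k = numel(O);                      % 4 possible values

F0       = (1:k) / k;              % hypothesized discrete uniform CDF at 1..4
FnAfter  = cumsum(O) / n;          % empirical CDF just after each jump
FnBefore = [0, FnAfter(1:end-1)];  % empirical CDF just before each jump

% Textbook KS statistic: largest gap on either side of each jump
D = max([abs(FnAfter - F0), abs(FnBefore - F0)]);

% Asymptotic 5% critical value, valid only for a continuous null distribution
Dcrit = 1.358 / sqrt(n);

fprintf('D = %.4f, approximate 5%% critical value = %.4f\n', D, Dcrit);
```

Under these assumptions the gaps measured just after each jump are tiny, but the gaps measured just before each jump are roughly the jump size of 1/4, because the hypothesized CDF is itself a step function. That puts D far above the critical value no matter how even the counts are, which is consistent with the h = 1 result in the question.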

1 Comment

John Smith
John Smith le 13 Oct 2021
Thank you, I missed that part on the first read. I suspect Jeff Miller's answer is the way to go then, as the chi-square test can be used with discrete distributions.


Release

R2020b
