problem with binary code
I have a question that, for the moment, has nothing to do with programming in MATLAB but with statistics, and I wonder if anyone can help.
My goal is to predict the next digit of a binary string. I have the following sequence of binary digits (the position of each digit is given in parentheses):
s= 1(1) 0(2) 1(3) 1(4) 0(5) 0(6) 1(7) 0(8) 0(9) 1(10) 1(11) 1(12) 1(13) 0(14) 0(15) 0(16) 1(17) 1(18) 1(19) 0(20).
I then built 4-digit substrings by taking, for every start position, every constant step between samples that still fits inside the sequence:
1) 1 (1) 0 (2) 1 (3) 1 (4) --- [1 0 1 1]
2) 1 (1) 1 (3) 0 (5) 1 (7) --- [1 1 0 1]
3) 1 (1) 1 (4) 1 (7) 1 (10) --- [1 1 1 1]
4) 1 (1) 0 (5) 0 (9) 1 (13) --- [1 0 0 1]
5) 1 (1) 0 (6) 1 (11) 0 (16) --- [1 0 1 0]
6) 1 (1) 1 (7) 1 (13) 1 (19) --- [1 1 1 1]
7) 0 (2) 1 (3) 1 (4) 0 (5) --- [0 1 1 0]
8) 0 (2) 1 (4) 0 (6) 0 (8) --- [0 1 0 0]
9) 0 (2) 0 (5) 0 (8) 1 (11) --- [0 0 0 1]
10) 0 (2) 0 (6) 1 (10) 0 (14) --- [0 0 1 0]
11) 0 (2) 1 (7) 1 (12) 1 (17) --- [0 1 1 1]
12) 0 (2) 0 (8) 0 (14) 0 (20) --- [0 0 0 0]
13) 1 (3) 1 (4) 0 (5) 0 (6) --- [1 1 0 0]
14) 1 (3) 0 (5) 1 (7) 0 (9) --- [1 0 1 0]
15) 1 (3) 0 (6) 0 (9) 1 (12) --- [1 0 0 1]
16) 1 (3) 1 (7) 1 (11) 0 (15) --- [1 1 1 0]
17) 1 (3) 0 (8) 1 (13) 1 (18) --- [1 0 1 1]
18) 1 (4) 0 (5) 0 (6) 1 (7) --- [1 0 0 1]
19) 1 (4) 0 (6) 0 (8) 1 (10) --- [1 0 0 1]
20) 1 (4) 1 (7) 1 (10) 1 (13) --- [1 1 1 1]
21) 1 (4) 0 (8) 1 (12) 0 (16) --- [1 0 1 0]
22) 1 (4) 0 (9) 0 (14) 1 (19) --- [1 0 0 1]
23) 0 (5) 0 (6) 1 (7) 0 (8) --- [0 0 1 0]
24) 0 (5) 1 (7) 0 (9) 1 (11) --- [0 1 0 1]
25) 0 (5) 0 (8) 1 (11) 0 (14) --- [0 0 1 0]
26) 0 (5) 0 (9) 1 (13) 1 (17) --- [0 0 1 1]
27) 0 (5) 1 (10) 0 (15) 0 (20) --- [0 1 0 0]
28) 0 (6) 1 (7) 0 (8) 0 (9) --- [0 1 0 0]
29) 0 (6) 0 (8) 1 (10) 1 (12) --- [0 0 1 1]
30) 0 (6) 0 (9) 1 (12) 0 (15) --- [0 0 1 0]
31) 0 (6) 1 (10) 0 (14) 1 (18) --- [0 1 0 1]
32) 1 (7) 0 (8) 0 (9) 1 (10) --- [1 0 0 1]
33) 1 (7) 0 (9) 1 (11) 1 (13) --- [1 0 1 1]
34) 1 (7) 1 (10) 1 (13) 0 (16) --- [1 1 1 0]
35) 1 (7) 1 (11) 0 (15) 1 (19) --- [1 1 0 1]
36) 0 (8) 0 (9) 1 (10) 1 (11) --- [0 0 1 1]
37) 0 (8) 1 (10) 1 (12) 0 (14) --- [0 1 1 0]
38) 0 (8) 1 (11) 0 (14) 1 (17) --- [0 1 0 1]
39) 0 (8) 1 (12) 0 (16) 0 (20) --- [0 1 0 0]
40) 0 (9) 1 (10) 1 (11) 1 (12) --- [0 1 1 1]
41) 0 (9) 1 (11) 1 (13) 0 (15) --- [0 1 1 0]
42) 0 (9) 1 (12) 0 (15) 1 (18) --- [0 1 0 1]
43) 1 (10) 1 (11) 1 (12) 1 (13) --- [1 1 1 1]
44) 1 (10) 1 (12) 0 (14) 0 (16) --- [1 1 0 0]
45) 1 (10) 1 (13) 0 (16) 1 (19) --- [1 1 0 1]
46) 1 (11) 1 (12) 1 (13) 0 (14) --- [1 1 1 0]
47) 1 (11) 1 (13) 0 (15) 1 (17) --- [1 1 0 1]
48) 1 (11) 0 (14) 1 (17) 0 (20) --- [1 0 1 0]
49) 1 (12) 1 (13) 0 (14) 0 (15) --- [1 1 0 0]
50) 1 (12) 0 (14) 0 (16) 1 (18) --- [1 0 0 1]
51) 1 (13) 0 (14) 0 (15) 0 (16) --- [1 0 0 0]
52) 1 (13) 0 (15) 1 (17) 1 (19) --- [1 0 1 1]
53) 0 (14) 0 (15) 0 (16) 1 (17) --- [0 0 0 1]
54) 0 (14) 0 (16) 1 (18) 0 (20) --- [0 0 1 0]
55) 0 (15) 0 (16) 1 (17) 1 (18) --- [0 0 1 1]
56) 0 (16) 1 (17) 1 (18) 1 (19) --- [0 1 1 1]
57) 1 (17) 1 (18) 1 (19) 0 (20) --- [1 1 1 0]
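The list above can be reproduced programmatically. The following is a minimal Python sketch, assuming the substrings are exactly "4 equally spaced samples, every start, every step that fits", which is what the 57 entries suggest; the function name `strided_substrings` is made up for illustration.

```python
# The 20-digit sequence from the question (positions are 1-based in the post).
s = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0]


def strided_substrings(seq, length=4):
    """Return (positions, digits) for every start and every constant step
    such that `length` equally spaced samples fit inside `seq`.
    Positions are 1-based to match the numbering used in the post."""
    n = len(seq)
    out = []
    for start in range(1, n + 1):
        step = 1
        # Keep increasing the step while the last sample is still inside seq.
        while start + (length - 1) * step <= n:
            positions = [start + k * step for k in range(length)]
            digits = [seq[p - 1] for p in positions]
            out.append((positions, digits))
            step += 1
    return out


subs = strided_substrings(s)
print(len(subs))  # 57, the same count as the list above
for number, (positions, digits) in enumerate(subs, start=1):
    print(number, positions, digits)
```

The entries come out in the same order as the hand-built list, so the two can be compared line by line.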
I have also calculated the relative frequency of each pattern among these 57 substrings:
0 0 0 0 ------ 0.0175438596491228
0 0 0 1 ------ 0.0350877192982456
0 0 1 0 ------ 0.0877192982456140
0 0 1 1 ------ 0.0701754385964912
0 1 0 0 ------ 0.0701754385964912
0 1 0 1 ------ 0.0701754385964912
0 1 1 0 ------ 0.0526315789473684
0 1 1 1 ------ 0.0526315789473684
1 0 0 0 ------ 0.0175438596491228
1 0 0 1 ------ 0.122807017543860
1 0 1 0 ------ 0.0701754385964912
1 0 1 1 ------ 0.0701754385964912
1 1 0 0 ------ 0.0526315789473684
1 1 0 1 ------ 0.0701754385964912
1 1 1 0 ------ 0.0701754385964912
1 1 1 1 ------ 0.0701754385964912
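These relative frequencies are simply count / 57 for each pattern. A short Python sketch of that tally, under the same assumption that the substrings are all equally spaced 4-samples that fit in the sequence (variable names are illustrative):

```python
from collections import Counter

s = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0]
L = 4          # substring length
n = len(s)

# All equally spaced 4-digit patterns (0-based indices here).
patterns = [
    tuple(s[start + k * step] for k in range(L))
    for start in range(n)
    for step in range(1, n)
    if start + (L - 1) * step < n   # last sample must be inside the sequence
]

counts = Counter(patterns)
total = len(patterns)

for pattern in sorted(counts):
    print(pattern, counts[pattern], counts[pattern] / total)
```

For example, the pattern 1 0 0 1 occurs 7 times out of 57, which gives the 0.1228 in the table.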
Now suppose I want to know whether digit number 21 of the sequence will be "0" or "1". To do this I append an unknown X at position 21:
s= 1(1) 0(2) 1(3) 1(4) 0(5) 0(6) 1(7) 0(8) 0(9) 1(10) 1(11) 1(12) 1(13) 0(14) 0(15) 0(16) 1(17) 1(18) 1(19) 0(20) X(21)
and then build the substrings that end at X:
1: 1 (18), 1 (19), 0 (20), X (21) --- [1,1,0, X]
2: 0 (15), 1 (17), 1 (19), X (21) --- [0,1,1, X]
3: 1 (12), 0 (15), 1 (18), X (21) --- [1,0,1, X]
4: 0 (9), 1 (13), 1 (17), X (21) --- [0,1,1, X]
5: 0 (6), 1 (11), 0 (16), X (21) --- [0,1,0, X]
6: 1 (3), 0 (9), 0 (15), X (21) --- [1,0,0, X]
And replacing the X I have:
X = 1,
1: 1 (18), 1 (19), 0 (20), X (21) --- [1,1,0, 1 ]
2: 0 (15), 1 (17), 1 (19), X (21) --- [0,1,1, 1 ]
3: 1 (12), 0 (15), 1 (18), X (21) --- [1,0,1, 1 ]
4: 0 (9), 1 (13), 1 (17), X (21) --- [0,1,1, 1 ]
5: 0 (6), 1 (11), 0 (16), X (21) --- [0,1,0, 1 ]
6: 1 (3), 0 (9), 0 (15), X (21) --- [1,0,0, 1 ]
X = 0,
1: 1 (18), 1 (19), 0 (20), X (21) --- [1,1,0, 0 ]
2: 0 (15), 1 (17), 1 (19), X (21) --- [0,1,1, 0 ]
3: 1 (12), 0 (15), 1 (18), X (21) --- [1,0,1, 0 ]
4: 0 (9), 1 (13), 1 (17), X (21) --- [0,1,1, 0 ]
5: 0 (6), 1 (11), 0 (16), X (21) --- [0,1,0, 0 ]
6: 1 (3), 0 (9), 0 (15), X (21) --- [1,0,0, 0 ]
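One way to turn the two candidate lists into numbers is to look up how often each completed pattern occurred in the observed data and add up the counts for X = 0 and for X = 1. The Python sketch below does that. It is only an ad hoc tally for illustration, not an established estimator: the substrings share digits, so the counts are not independent observations, and the names `prefixes_ending_at_next` and `score` are hypothetical.

```python
from collections import Counter

s = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0]
L = 4
n = len(s)

# Counts of all equally spaced 4-digit patterns in the observed 20 digits.
counts = Counter(
    tuple(s[start + k * step] for k in range(L))
    for start in range(n)
    for step in range(1, n)
    if start + (L - 1) * step < n
)


def prefixes_ending_at_next(seq, length=4):
    """First length-1 digits of every equally spaced substring whose last
    element would be the (not yet observed) digit after the end of seq."""
    target = len(seq)          # 0-based index of the unknown digit X
    out = []
    step = 1
    while target - (length - 1) * step >= 0:
        out.append(tuple(seq[target - k * step]
                         for k in range(length - 1, 0, -1)))
        step += 1
    return out


prefixes = prefixes_ending_at_next(s)   # [1,1,0], [0,1,1], [1,0,1], ...

# Sum of observed counts of the completed patterns for each candidate X.
score = {x: sum(counts[p + (x,)] for p in prefixes) for x in (0, 1)}
print(prefixes)
print(score)
```

Whether a larger tally for one value of X means anything depends entirely on whether the generating process has structure to exploit; for a fair, independent process it does not.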
From here, could someone tell me how to estimate the probabilities so that I can predict the next digit in the string?
10 comments
dpb
on 30 Aug 2013
Edited: dpb
on 30 Aug 2013
If it's a fair coin, then the P(H) on the (N+1)th trial is still 0.5 whatever the preceding sequence -- even if the preceding N were all T (or H).
If it's not fair, then estimate the actual bias p (or q = 1 - p). As noted above, the number of H in the sequence gives insufficient evidence to conclude that p = 0.5 isn't as good a value as any.
What other information is there to use? The point is that one random realization of a process is subject to randomness, so another realization from the same process could produce the obverse of what you have above, namely
Pobs(1101) ~ 0.05
Pobs(1100) ~ 0.07
instead, or some other values entirely. I don't see any reason not to use the expected value of 1/16 = 0.0625 for either sequence. Note that your values are scattered on either side of that, which is additional evidence that the underlying assumption of a binomial with p=q=0.5 is as good a model as any.
ADDENDUM: That the observed values are roughly equal to 1/16 agrees with the other note just posted: even though the samples were selected other than in the order they arrived, the generating process looks, from this limited sample, pretty much like a fair binomial with p=0.5.
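The remark that the number of 1s gives no reason to reject p = 0.5 can be checked with an exact two-sided binomial test. This is a minimal Python sketch using only the standard library; the choice of test and the helper name `binom_pmf` are assumptions for illustration, not something stated in the thread.

```python
from math import comb

s = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0]
n = len(s)     # number of trials
k = sum(s)     # number of 1s observed


def binom_pmf(i, n, p=0.5):
    """Probability of exactly i successes in n independent trials."""
    return comb(n, i) * p**i * (1 - p) ** (n - i)


# Exact two-sided p-value under H0: p = 0.5. Sum the probabilities of all
# outcomes that are no more likely than the observed one (the small factor
# guards against floating-point ties).
p_obs = binom_pmf(k, n)
p_value = sum(
    binom_pmf(i, n) for i in range(n + 1)
    if binom_pmf(i, n) <= p_obs * (1 + 1e-9)
)

print(k, n, p_value)   # 11 ones out of 20, p-value about 0.82
```

A p-value this large means 11 ones in 20 digits is entirely compatible with a fair process, which is the point made above.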
dpb
on 30 Aug 2013
Another comment on the "probabilities" you've calculated from sequences. If the process is one of generating a sequence, then the only sequences that represent what was observed are contiguous runs n:m. Many of the subsequences you've listed above were selected arbitrarily, with different steps between samples, and they are not actual sample sequences unless your earlier assumption that there is serial dependence is violated.
If you have reason to look at four consecutive samples, the only valid observations I can see are the 17 windows you can construct from 1:4, 2:5, ..., 17:20. Everything else is a valid sequence only if there is no correlation at all from one sample to the next (which seems to contradict the earlier assertion that the underlying process is not purely random).
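Restricting the tally to contiguous windows, as suggested here, is a small change. A minimal Python sketch (variable names are illustrative):

```python
from collections import Counter

s = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0]
L = 4

# Sliding windows 1:4, 2:5, ... over the sequence: len(s) - L + 1 of them.
windows = [tuple(s[i:i + L]) for i in range(len(s) - L + 1)]
counts = Counter(windows)

print(len(windows))
for pattern in sorted(counts):
    print(pattern, counts[pattern], counts[pattern] / len(windows))
```

With so few windows, most patterns appear only once or twice and several of the 16 possible patterns do not appear at all, so the resulting frequencies are very noisy estimates.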
Answers (1)
David Sanchez
on 30 Aug 2013
In your case, the relative frequency of the sequence
0 0 1 1 is 0.0701754385964912
So if you take your whole sequence as a projection of what is going to happen in the future, that same relative frequency is your estimate of the likelihood (probability) of the pattern occurring again. The likelihood of 0000 is 0.0175438596491228, the likelihood of 0001 is 0.0350877192982456, and so on.
3 comments
dpb
on 30 Aug 2013
But unless the generator is particularly flawed for some reason (the period is very short, say, or it has a peculiarity such as serial correlation of period N that you can exploit), there's not much to be said other than that this particular realization gave you this particular case.
If you're just trying to study a given generator, then, as mentioned above, look at the NIST battery of tests for randomness for ideas on how and what is tested for in general.
Perhaps if you outlined the end objective you're headed for, rather than focusing on the mechanics, it would lead to a better response. As is, I just don't see what good it will do you to look at it this way unless you are trying to qualify the PRNG.
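As one concrete example of what such a battery checks, the simplest test in the NIST suite (SP 800-22) is the frequency, or monobit, test: map the bits to +1/-1, sum them, and compare the normalized sum against what a fair process would give. The Python sketch below follows that definition; the function name is made up, and 20 digits is far too short for the result to carry any real weight.

```python
from math import erfc, sqrt

s = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0]


def monobit_p_value(bits):
    """Frequency (monobit) test: p-value for the hypothesis that 0s and 1s
    are equally likely, based on the normalized sum of +1/-1 values."""
    n = len(bits)
    s_n = sum(1 if b else -1 for b in bits)   # +1 for each 1, -1 for each 0
    s_obs = abs(s_n) / sqrt(n)
    return erfc(s_obs / sqrt(2))


p = monobit_p_value(s)
print(p)   # about 0.65 for this sequence
```

In the NIST suite a sequence fails this test when the p-value is below 0.01; here it is nowhere near that, so this check shows no sign of bias.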