How to calculate the conditional probability of an event?
17 vues (au cours des 30 derniers jours)
Afficher commentaires plus anciens
Myriam Moss
le 23 Avr 2021
Réponse apportée : William
le 25 Avr 2021
I have an array similar to this array = [A A B A C A B B B C C A A C]. I want to calculate p(C|A), p(C|B), p(C|C). How can I do this just having this information? I want to know what is the probability of C happening after a previous event A, B or C.
0 commentaires
Réponse acceptée
William
le 23 Avr 2021
Hello Myriam. It is not clear whether A, B, and C here are text characters, or if they have numeric values. If we assume they have numerical values, like A=1, B=2 and C=3, then you could use
y = diff(array);
P_AC = sum(y==2);
P_BC = sum(y==1);
P_CC = sum(y==0);
1 commentaire
Plus de réponses (2)
William
le 25 Avr 2021
Actually, I believe that p(C|A) would be:
y = strfind(array, 'A');
N_A = length(y);
p_CA = N_AC/N_A;
There is one further thing to consider, though. It may be true that the very last element of array is an 'A'. I don't think this should be counted in N_A because we don't know whether it would have been followed by a 'C' or not. So, if the last element of array is 'A', we should reduce N_A by 1.
y = strfind(array, 'A');
N_A = length(y);
if y(end)==length(array) || y(end)==length(array)-1 % The string might end
N_A = N_A - 1; % with an 'A' or an 'A '
end
p_CA = N_AC/N_A;
0 commentaires
William
le 25 Avr 2021
Myriam,
If A, B and C were variables with the values 1, 2 and 3, then in your example:
array = [1, 1, 2, 1, 3, 1, 2, 2, 2, 3, 3, 1, 1, 3]
The diff() function returns the difference between each value and the next value, so
diff(array) = [0, 1, -2, 2, -2, 1, 0, 0, 1, 0, -2, 0, 2]
Every time an A is followed by a C, the difference is 2. Every time a B is followed by a C, the difference is 1. So, I was suggesting that you count the number of times A is followed by C by counting the number of times that the value 2 appears in diff(array) with a statement like c = sum(diff(array) == 2). Unfortunately, I see now that this does not work correctly for the number of times B is followed by C, because this results in a value of 1 in diff(array), and a value of 1 is also produced when an A is followed by a B.
Since you have said that A, B and C are characters, I assume that you mean that:
array = 'A A B A C A B B B C C A A C';
In this case, maybe a better solution would be:
y = strfind(array, 'A C');
N_AC = length(y);
y = strfind(array, 'B C');
N_BC = length(y);
Voir également
Catégories
En savoir plus sur Dimensionality Reduction and Feature Extraction dans Help Center et File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!