Huffman Dictionary, error with index exceed when i give specific probabilities.

5 vues (au cours des 30 derniers jours)
When i change my probabilities numbers i receive an error.
Also if i change some of my probabilities im not getting the error.
Index exceeds the number of array elements. Index must not exceed 1.
Error in huffmandict (line 17)
min1 = sorted_p(2);
The probabilties are the frequencies of each english letter.
Input Data:
probabilities=[0.0698 0.0128 0.0238 0.0364 0.1086 ...
0.0190 0.0172 0.0521 0.0595 0.0013 ...
0.0066 0.0344 0.0206 0.0577 0.0642 ...
0.0165 0.0008 0.0512 0.0541 0.0774 ...
0.0236 0.0084 0.0202 0.0013 0.0169 0.0006 0.1453];
alphabet = {'a' 'b' 'c' 'd' 'e' 'f' ...
'g' 'h' 'i' 'j' 'k' 'l' ...
'm' 'n' 'o' 'p' 'q' 'r' ...
's' 't' 'u' 'v' 'w' 'x' 'y' 'z' '-'};
dict = huffmandict(alphabet, probabilities);
My huffman dictionary script
function dict = huffmandict(symbols, prob)
%Probability Length
probability_length = length(prob);
for i = 1:probability_length
code{i} = '';
symbol{i} = i;
end
while(prob ~= 1)
% Sorting probabilities
[~,sorted_p] = sort(prob);
% Indexes to be merged
min = sorted_p(1);
min1 = sorted_p(2);
min_set = symbol{min};
min1_set = symbol{min1};
% Get the sum of the probabilities
new_possibility = prob(min) + prob(min1);
merged_set = [min_set, min1_set];
% Probability updates
symbol(sorted_p(1:2)) = '';
symbol = [symbol merged_set];
prob(sorted_p(1:2)) = '';
prob = [prob new_possibility];
% Assigning 0 and 1 to symbols
for i = 1:length(min_set)
code{min_set(i)} = strcat('1',code{min_set(i)});
end
for i = 1:length(min1_set)
code{min1_set(i)} = strcat('0',code{min1_set(i)});
end
% Constructing the dictionary
dict = cell(probability_length,2);
for i=1:probability_length
dict{i,1} = symbols{i}(1);
dict{i,2} = code{i};
end
end
end

Réponse acceptée

David Goodmanson
David Goodmanson le 6 Jan 2022
Modifié(e) : David Goodmanson le 6 Jan 2022
Hi Rafael,
sum(probabilities)
ans = 1.0003
SInce the probabilities don't add up to 1, the code is going to have problems. I arbitrarily reduced the last probability by .0003, so now the last line is
0.0236 0.0084 0.0202 0.0013 0.0169 0.0006 0.1450]
Now
sum(probabilities)
ans = 1
and it works. However, you are using a check that reads
while(prob ~= 1)
and depends on the sum equaling 1 exactly. When adding floating point numbers this will only happen by luck. For example, suppose you change the last line to
0.0239 0.0081 0.0202 0.0013 0.0169 0.0006 0.1450]
This has increased the first element by .0003 and decreased the second element by .0003. The sum is still 1, but
sum(probabilities)-1
ans = 2.2204e-16
so the exact 'while' check fails. That check needs to be changed to something like
while abs(prob-1)>tol
where tol is some suitable tolerance such as 1e-10.
  4 commentaires
Rafael Nicolaou
Rafael Nicolaou le 11 Jan 2022
Modifié(e) : Rafael Nicolaou le 11 Jan 2022
Hello sorry for the delay got tested postive for covid.
After changing the probabilities anything above 1 + 1e-8 is not working.
If i do anything smaller than 1e-9 works perfect.
The problem is my probabilities are standar and i cant change them.
David Goodmanson
David Goodmanson le 11 Jan 2022
Modifié(e) : David Goodmanson le 12 Jan 2022
Hi Rafael, If you have a list of probabilities and indend to use every one of them even if they don't add up to exactly 1, you could temporarily rescale them so that they do add up to 1, within something on the order of 10^-16. That doesn't change their relative sizes so the huffman code is unaffected.
probabilitiesnew = probabilities/sum(probabilities)
Then a tolerance of 1e-10 should work.

Connectez-vous pour commenter.

Plus de réponses (1)

Cris LaPierre
Cris LaPierre le 6 Jan 2022
The error is occuring the 27th time the while loop runs. prob only has a single value at this point, so sorted_p(2) does not exist.
A=5;
% works
A(1)
ans = 5
% Doesn't work
A(2)
Index exceeds the number of array elements. Index must not exceed 1.

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by