How to store sequence of bits as a bit stream and use the least possible memory ?
19 vues (au cours des 30 derniers jours)
Afficher commentaires plus anciens
I am implementing a huffman encoding procedure which for example encodes the decimal number -5 as '100010'. Until now I have implemented some code which correctly produces the encoded sequence. The encoded sequence is stored as a string. However, I want the bit sequence to be stored as a bit stream and use only the neccesary amount of space for storage. For example, I want the encoded sequence for -5, which is '100010', to take 6 bits of memory. Is that possible ?
1 commentaire
Guillaume
le 10 Jan 2020
Note that in all modern processors and certainly all that run matlab (x86/x64) the minimum unit of memory is 8 bits and everything is in multiple of 8.
Réponses (3)
Matt J
le 10 Jan 2020
Modifié(e) : Matt J
le 10 Jan 2020
You could chain all your strings together as a logical column vector and use bwpack, see
Example:
>> A='10001010001010001011010100011000100010100010';
>> B=bwpack((A-'0').');
>> whos A B
Name Size Bytes Class Attributes
A 1x44 88 char
B 2x1 8 uint32
0 commentaires
Stephane Dauvillier
le 10 Jan 2020
You can use the functions:
- dec2bin to tranform decimal number into binary,
- string to transform the output of the dec2bin from char to string
- regexprep to replace the first 0 by empty
- strjoin to join everything
For instance:
strjoin(regexprep(string(dec2bin([66,8,19,7])),"^0+","","once"),"")
0 commentaires
Guillaume
le 10 Jan 2020
A bit similar to Matt's suggestion, but may be slightly more memory efficient since bwpack uses multiples of 32-bit, is to:
- split your stream into block of 8-bits
- pad the last block to 8 bits if shorter than 8.
- convert each block to uint8, the smallest unit of storage on x86/x64
eg.
bitstream = char(randi(double('01'), 1, 19)) %random stream of 19 0-1 characters
encoded = bitstream - '0'; %convert to binary vector. Uses 4 times as much memory as bitstream
encoded = [encoded, zeros(1, mod(-numel(bitstream), 8))]; %pad with zeros at the to multiple of 8
encoded = uint8(sum(reshape(encoded, 8, []) .* 2.^(7:-1:0)')) %convert each sequence of 8 bits into 8-bit integer. 1st bit is MSB
You can easily convert back to the (padded) bitstream:
padded_origstream = reshape(dec2bin(encoded, 8)', 1, [])
0 commentaires
Voir également
Catégories
En savoir plus sur Large Files and Big Data dans Help Center et File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!