function [Kraft_vector, optimum] = unsorted_limited_kraft( histogram, bit_limit ) % unsorted_limited_kraft: generate Kraft vector with restriction % on maximum bit length. % [Kraft_vector, changed] = limited_kraft( histogram, bit_limit ) % Builds the optimum Kraft vector for this histogram, % such that max(Kraft_vector) < bit_limit. % % The "optimum" flag is set to 1 when this Kraft vector % is a Huffman code; i.e., when % the same Kraft vector would be generated % (by unsorted_kraft) even without this restriction. % % The "optimum" flag is cleared to 0 when this Kraft % vector does not generate a Huffman code; i.e., % when a Kraft vector of bit_limit or longer % would have been better for compressing your data. % % Usage: % plaintext = [7 7 7 7 4 4 4 3 2 1 1 7]; % histogram = histo(plaintext(:)); % optimum_Kraft_vector = unsorted_kraft( histogram ) % constrained_vector = unsorted_limited_kraft( histogram, 4 ) % optimum_length = histogram' * optimum_Kraft_vector % constrained_length = histogram' * Kraft_vector % returns % optimum_Kraft_vector = [ 0 3 4 4 2 0 0 1 ]' % constrained_vector = [ 0 3 3 3 3 0 0 1 ]' % % optimum_length = 25 % constrained_length = 26 % % The JPEG standard, for example, requires that % each code be shorter than bit_limit = 17 bits. % How does JPEG do it ? % % See also UNSORTED_KRAFT, HUFFMAN_COMPRESS. % 1999-07-10:DAV: David Cary started. % Before we get started, make sure % it's actually possible to construct a codebook % without overflowing the bit_limit. max_allowed_symbols = 2^(bit_limit-1); symbols_used = length(find( histogram )); % ASSERT( symbols_used <= max_allowed_symbols ) if( symbols_used <= max_allowed_symbols ) else max_allowed_symbols symbols_used error('I''m sorry Dave. I don''t think I can do that.') end; optimum_Kraft_vector = unsorted_kraft( histogram ); if( max(optimum_Kraft_vector) < bit_limit ) optimum = 1; Kraft_vector = optimum_Kraft_vector; else optimum = 0; warning('non-optimal Kraft vector generated.') % avoid modifying Kraft value % for items that never occur. I = find(0 == histogram); histogram(I) = +inf; % Starting with optimum Kraft vector, % tweak until it meets the bit_limit criteria. new_kraft_vector = optimum_Kraft_vector; while( bit_limit <= max( new_kraft_vector ) ), % find all symbols that could be made longer short_symbol = find( new_kraft_vector < bit_limit-1 ); % pick the least-frequently-occuring symbol % of these short symbols % (to cause minimum disruption). [m, i] = min( histogram(short_symbol) ); chosen_short = short_symbol(i); % find all of the longest symbols problems = find( max(new_kraft_vector) == new_kraft_vector ); % arbitrarily pick any one of the longest symbols, % prune it off (adjusting its symbol length appropriately) % and graft it to the short symbol we picked % (so they are now siblings, with length % one more than the original length of that short symbol). % % Before: % chosen short symbol, of bitlength s % .../ % \ % ... % % ... % .../ % \ chosen long symbol, of bitlength L % \/ % \ % sibling of long symbol, of bitlength L % % After: % chosen short symbol, of bitlength s+1 % / % .../\ % \ chosen long symbol, of bitlength s+1 % ... % % ... % .../ % \ % sibling of long symbol, of bitlength L-1 % % If s+1 is still much shorter than the maximum % bit length, then % the "chosen long symbol" this iteration % will become the "chosen short symbol" next iteration. chosen_long = problems(1); sibling = problems(2); short_length = new_kraft_vector( chosen_short ); new_kraft_vector( chosen_short ) = 1 + short_length; new_kraft_vector( chosen_long ) = 1 + short_length; new_kraft_vector( sibling ) = new_kraft_vector( sibling ) - 1; % every iteration, grab the next-least-frequent symbol in the histogram. % Repeat until we have enough for a binary tree % that will just squeek in under the bit limit. end; Kraft_vector = new_kraft_vector; end; % end unsorted_limited_kraft.m