# Re: Compression, encoding, entropy

From: David Mertz <voting-project_at_gnosis_dot_cx>
Date: Mon May 03 2004 - 21:13:41 CDT

On May 3, 2004, at 9:48 PM, Arthur Keller wrote:
>> >>> from math import e, log, ceil, floor
>> >>> def SD(n):
>> ...     """Self-delimited data bit-length requirement
>> ...     Let n be the highest number being encoded
>> ...     Let d be the number of bits in a digit
>> ...     Let b be the base (2**d)
>> ...     Let m be one-half of the base
>> ...     The number of digits, g = floor (log_m(n) +1)
>> ...     The number of bits is d*g
>> ...     """
>> ...     d = 1/log(2)  # 1.443 bits per base-e digit
>> ...     m = e/2  # b == 2**(1/log(2)) == e
>> ...     g = ceil(log(n)/log(m))  # log_m(n) == log(n)/log(m)
>> ...     return d*g
>> ...

> David, your definitions of d and m don't match mine.

I'm just trying to program your description. But apparently I haven't
managed to understand what it is you are describing yet. Does anyone
else understand it, and could they help me?

> d is an integer, probably in the range of 2 to 4.
> It's a parameter, not some function of a log.

I don't understand that. When I read "number of bits in a digit" the
only thing that comes to my mind is the number of bits of information
that it takes to represent a digit in a given base, i.e. log2 of the
base: approximately 1.443 for base e, or 3.322 for base 10.
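
To make that reading concrete, here is a minimal sketch of "bits per digit" taken as the base-2 logarithm of the base. The helper name `bits_per_digit` is made up for illustration and is not part of the quoted code.

```python
from math import e, log

def bits_per_digit(base):
    """Bits of information carried by one digit in the given base: log2(base)."""
    return log(base) / log(2)

print(round(bits_per_digit(e), 3))   # 1.443
print(round(bits_per_digit(10), 3))
print(round(bits_per_digit(16), 3))  # a whole number only when base is a power of 2
```

Under this reading, d comes out as an integer only when the base is a power of two, which may be where the two interpretations part ways.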

I'm not trying to be difficult here, but I have no idea what you mean
by 'd' from your description. Why is it probably in the range 2 to 4?
How would we know what it was?

> Please use an algorithm closer to my formula, and use other (unused)
> variable names for your helper logarithms.

I wish I knew how. I've tried my best to use exactly the names you used.
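
For comparison, a literal reading of the quoted docstring with d taken as an integer parameter might be sketched as below. This is a guess at the intent, not a confirmed implementation: the function name `sd_bits`, the requirement d >= 2 (so that m > 1 and log_m is defined), and the use of repeated integer division in place of a floating-point logarithm are all assumptions.

```python
def sd_bits(n, d):
    """Bit length under one reading of the docstring (hypothetical helper).

    n: highest number being encoded (assumed >= 1)
    d: bits per digit, an integer parameter (assumed >= 2)
    """
    if d < 2:
        raise ValueError("d must be at least 2 so that m = 2**(d-1) > 1")
    if n < 1:
        raise ValueError("n must be at least 1")
    m = 2 ** (d - 1)   # one-half of the base b = 2**d
    g = 0              # g = floor(log_m(n) + 1), i.e. count of base-m digits in n
    while n > 0:
        n //= m
        g += 1
    return d * g       # d bits for each of the g digits

for d in (2, 3, 4):
    print(d, sd_bits(100, d))
```

Counting digits by integer division avoids the rounding trouble that floor(log(n)/log(m)) can run into when n is an exact power of m.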