Re: Compression, encoding, entropy

From: David Mertz <voting-project_at_gnosis_dot_cx>
Date: Mon May 03 2004 - 21:13:41 CDT

On May 3, 2004, at 9:48 PM, Arthur Keller wrote:
>> >>> from math import e, log, ceil, floor
>> >>> def SD(n):
>> ... """Self-delimited data bit-length requirement
>> ... Let n be the highest number being encoded
>> ... Let d be the number of bits in a digit
>> ... Let b be the base (2**d)
>> ... Let m be one-half of the base
>> ... The number of digits, g = floor (log_m(n) +1)
>> ... The number of bits is d*g
>> ... """
>> ... d = 1/log(2) # 1.443 bits per base-e digit
>> ... m = e/2 # b == 2**(1/log(2)) == e
>> ... g = ceil(log(n)/log(m)) # log_m(n) == log(n)/log(m)
>> ... return d*g
>> ...

> David, your definitions of d and m don't match mine.

I'm just trying to program your description. But apparently I haven't
managed to understand what it is you are describing yet. Does anyone
else understand it well enough to help me?

> d is an integer, probably in the range of 2 to 4.
> It's a parameter, not some function of a log.

I don't understand that. When I read "number of bits in a digit" the
only thing that comes to my mind is the number of bits of information
that it takes to represent a digit in a given base. I.e. 1.443
(approx) for base e, or 3.322 for base 10.
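
To make that reading concrete, here is a minimal sketch of what I mean
by "bits in a digit": the base-2 logarithm of the base. The function
name bits_per_digit is made up for illustration only.

```python
from math import e, log2

def bits_per_digit(base):
    """Information content, in bits, of one digit in the given base."""
    if base <= 1:
        raise ValueError("base must be greater than 1")
    return log2(base)

# Compare a few bases; a power of two gives a whole number of bits.
for base in (e, 10, 16):
    print(base, bits_per_digit(base))
```

Under this reading d is only an integer when the base is a power of two.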

I'm not trying to be difficult here, but I have no idea what you mean
by 'd' from your description. Why is it probably in the range 2 to 4?
How would we know what it was?

> Please use an algorithm closer to my formula, and use other (unused)
> variable names for your helper logarithms.

I wish I knew how. I've tried my best to use your identical names.
What ARE your formulae?

Does someone else on this list understand what Arthur is trying to
describe? I'm guessing that whatever it is only takes a few lines of
code to implement... but I can't figure out which lines those might be.
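
For what it's worth, here is one guess at what the docstring might mean
if d really is an integer parameter. This is only a sketch under
assumptions of mine, not Arthur's confirmed formula: the name sd_bits is
invented, and I am assuming that "m is one-half of the base" means one
bit of every d-bit digit is reserved (say, as a "more digits follow"
flag), so each digit carries a value in base m = 2**(d-1).

```python
def sd_bits(n, d):
    """Bits needed to encode n using d-bit digits, under a guessed reading.

    Assumption: b = 2**d, m = b/2, g = floor(log_m(n) + 1) digits,
    and the total is d*g bits.
    """
    if d < 2:
        raise ValueError("d must be at least 2, otherwise m <= 1")
    if n < 0:
        raise ValueError("n must be non-negative")
    m = 2 ** (d - 1)   # one-half of the base b = 2**d
    g = 1              # digit count; equals floor(log_m(n)) + 1 for n >= 1
    while n >= m:      # count base-m digits with integer arithmetic
        n //= m
        g += 1
    return d * g
```

Counting digits by integer division avoids floating-point rounding in
log(n)/log(m) when n is an exact power of m. If this guess is right, it
would also explain why d cannot be 1, since m would then be 1.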
==================================================================
= The content of this message, with the exception of any external
= quotations under fair use, are released to the Public Domain
==================================================================
Received on Mon May 31 23:17:07 2004