Re: Compression, encoding, entropy

From: charlie strauss <cems_at_earthlink_dot_net>
Date: Sun May 02 2004 - 14:14:02 CDT

On May 2, 2004, at 2:19 AM, charlie strauss wrote:
>> As someone who actually got their PHd doing information theory I'm
>> really bewildered by the barcod length thread

>Me too.

I sounded sort of arrogant there. Yikes. wasn't trying to. just bewildered that people seemed to be unconvinced by your (Dave Mertz) statements on the thread. I was essentially seconding all your statements on the thread not disputing them.

>> Second, lossless compression cannot reduce the maximum message size in
>> fact it will only increace it, always.

>Almost. As a general principle this is true, but it assumes that all
>messages are possible as encodings.

I totally agree. I sort of got my points out of order. once you have an encoding that doesn not include non-existant choices then there is no further compression of the maximal message length to be had. You pointed this out with the ascii example, so I'm just agreeing with you

> >third, dont confuse compression with optimal encoding. you can encode
>> the vote in such a way--not using any compression--that its maximum
> >size is a minimal. Namely, here is an optimal receipe--dont bother
> >looking for a better one: enumerate every possible ballot and assign
> > a ordinal number too it.

>That's basically what I will do (I have the mentioned script sketched
>out mentally, I just need to write it down). But as I indicated, I
>will enumerate per-contest, thereby letting each contest fall on a
>bit-boundary in its representation. I feel that has enough debugging
>advantages to warrant the extra 1/2-bit per contest it will require, on
>average, for the encoding. Plus it makes it WAY easier to program in
>the first place (and for someone else to program a compatible
>reimplementation--e.g. for a BVA).

That's how I'd do it too. As for lookup tables that was intended to be pedagogical existance proof. In most cases, esepcially with single contest encoding, the mapping can be done alrgorithmically rather than using a look-up table. (btw, while it makes no real sense to do so, you could still in principle represent a 8x8 ranked preference with lookup tables: just use a product state for the first four and last four races. this would not be an optimal encodeing but would approximate it feasibly)

>> Fifth, and I've mentioned this before, as part of the error correction
>> scheme, consider writing error correction data from one ballot on
>> other ballots.

>Nope. I still violently disagree with this. For newcomers, this was
>hashed out before on the list.

I have of course read your objections to this and agree with them in general. While it's obvious that serial numbers and other partial order information offer expedients to solving security issues with paper ballots, this expeident is severely outweighed by the general principle of voter annonimity that you have championed. On the otherhand, before dismissing this as a non-starter I'd like to introduce a concept that might be considered acceptable. Currently there are certain points in the OVC process where ballot order information is deliberately erased. The first is in the voting machine, where the average voter is taking it on faith that the machine is not internally keeping a time stamp with his vote. (open source review is used to assure this). The second is when the ballot box is opened and the ordered ballots shuffled before observers. So it should be apparent that OVC requires a certain measure of trust in the system to assure that these two safegaurds function and order information is erased
properly. In otherwords the safegaurds are strictly admisitrative solutions not engineering solutions and this is considered acceptable. I would venture to propose therefore that order information could be safely retained as long as it was encoded and that the keys to decode it were held privately by some trusted set of parties. (e.g. an adversarial group of party representatives, county officials, and randomly chosen designates...or whatever, none of whom could recover the info without official due process and mutual agreement). This is just as administratively secure as the other points where order information is protected by OVC. As a concept one could propose that in the event one wanted to unlock the encoded information the computer that would do this would, using open source, decode the encoded information in bulk, randomize the order of the ballot checksums, then erase any order information it had before terminating.
==================================================================
= The content of this message, with the exception of any external
= quotations under fair use, are released to the Public Domain
==================================================================
Received on Mon May 31 23:17:03 2004

This archive was generated by hypermail 2.1.8 : Mon May 31 2004 - 23:18:15 CDT