From: Chris Schaefer <evm_at_1reality_dot_org>

Date: Sat Sep 13 2003 - 15:29:17 CDT

Date: Sat Sep 13 2003 - 15:29:17 CDT

Architectural document is a HUGE improvement. Thank you.

here are some of my thoughts:

I started by trying to find the real amount of information ( ie the

minimum size we could compress to )

For each item on the ballot I see the following 4 types of responses:

i) YesOrNo - requires 2 bits or log2(3) ( remember lack of

choice needs to be represented )

ii) PickOne - requires log2(M+1) bits where M is

number of choices

iii) PickNofM - requires N * log2(M+1) bits where M is number of

choices and N is number to pick

iv) Rank - requires M * log2(M+1) bits where M is number

of choices to rank.

(note all fractional bits are rounded UP )

(Obviously a write-in situations complicates things a bit on the bit

counting. But, we can encode the standard 26 letters plus a space in

5 bits so that's 5 bits X #letters )

For purposes of discussion I have separated out PickOne and YesOrNo

from PickNofM just for consideration in vote count systems, even

though it's clear that one is a subset of the other. Further you

might argue that a single rank item Rank is a composed of a sequence

of PickOne items ( where you pick a single position for each

candidate ).

So I'm getting somewhere around 72 bits of data to represent our

sample ballot without write-in text. I don't believe any compression

scheme will find much to squeeze out of that. It's important to

remember that compression system will not work well on very short

lengths of data like ours. They're designed for kBytes at a minimum,

not

segments of 50 bytes or so. The best way to compress this data is to

represent it in as compact a fashion as possible.

Now Jans encoding scheme is to move the 72 bits representation as a

single _VERY_ large binary number to decimal, and then use the code

128 decimal scheme to represent that. Before I continue, let me

just say as a programmer that dealing with weird bit-aligned fields

in data is very difficult and error prone. In the interests of

clarity my inclination would be to move to byte aligned data, or

perhaps nibble (4 bit) aligned data.

In this fashion each PickOne field would be represented by either a

nibble for most election or a byte for exceptional election ( like

our recall governor race here in CA ). That would give us room for

14 candidates, a write-in and a no-mark in the case of a nibble or

253 candidates, a write-in and a no-mark in the case of the byte.

If we go the nibble route we can represent the sample data in 22

nibbles or 11 bytes. (88 bits for those of you who are counting

bits ).

A final option would further increase the size slightly but eliminate

the need for large integer math. Basically store up to 8 candidates

and a write-in and a nomark. This would be a single decimal digit.

The expanded version would allow for 98 candidates ( yuck, doesn't

fit CA election ).

Each of these proposals could utilize the XOR visual obfuscation method.

Before we continue to much further can we find out some salient info

1) How many ballot items will we need to represent?

2) How long a barcode can we tolerate?

These will allow us to determine how much additional info ( like

ballot item # , and vote type ) we can include along with the vote

data.

-C-

-- -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Chris Schaefer Email: chris<AT>1reality(DOT)org Professional Bit Twiddler and student of reality. "Global Information, Local Production" -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= ================================================================== = The content of this message, with the exception of any external = quotations under fair use, are released to the Public Domain ==================================================================Received on Tue Sep 30 23:17:04 2003

*
This archive was generated by hypermail 2.1.8
: Tue Sep 30 2003 - 23:17:09 CDT
*