From: <Adechert_at_aol_dot_com>

Date: Fri Sep 12 2003 - 18:53:05 CDT

In a message dated 9/12/03 3:35:19 PM Pacific Daylight Time,

voting-project@gnosis.cx writes:

> There is an additional requirement (discussed and agreed earlier) that
> is not reflected in Jan's otherwise quite excellent samples. In order
> to prevent easy visual identification of cast votes via the barcodes, we
> will pad the position of actual vote data by a random amount, per ballot.

Respectfully, David, the scheme we agreed on (proposed by Arthur) also

included a scheme for compression.

http://gnosis.python-hosting.com/voting-project/August.2003/0237.html

Jan's proposal does not use compression. Padding with many more symbols will

make a very long barcode and we don't know what readers will be able to read

it accurately. Jan is encoding the 35 digits (plus leading zero) with 22

symbols. It's already about 2.5 inches long. If we increase that to 40 symbols

(a decimal number of 60 digits or so) plus 2 more for the ballot number, that may

be a barcode of 5 inches or more. From my research on this topic, a barcode

this long will be a problem.
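As a rough check on that estimate, here is a back-of-the-envelope sketch in Python. It assumes the barcode length grows linearly with the number of symbols and ignores any fixed overhead such as start/stop characters or quiet zones; the function and constant names are made up for illustration.

```python
# Rough estimate only: assumes length scales linearly with symbol count,
# using Jan's sample (22 symbols in about 2.5 inches) as the reference.
REFERENCE_SYMBOLS = 22
REFERENCE_INCHES = 2.5


def estimated_length_inches(symbols):
    """Scale the reference barcode length to a different symbol count."""
    return symbols * REFERENCE_INCHES / REFERENCE_SYMBOLS


if __name__ == "__main__":
    for n in (22, 24, 42):
        print(n, "symbols: about", round(estimated_length_inches(n), 2), "inches")
```

Under that assumption, 42 symbols come to a little under 5 inches before any overhead the symbology itself adds.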

Jan's work proves that we can get by for the demo without compression.

However, if we add the padding feature (to help ensure non-human readability and

establish a fixed length for the barcode), we may also need the compression.

The 116-bit string will be highly compressible since only a tiny fraction of

possible 116-bit strings can result from the pattern of selections on our

sample ballot.
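To illustrate why such a string should compress well, here is a small Python sketch. It is not Arthur's scheme (see the linked message for that). It assumes one bit per candidate and vote-for-one contests, and the contest sizes are invented for illustration; they are not our sample ballot.

```python
# Hypothetical vote-for-one contests, given as the number of candidates in each.
# These sizes are invented for illustration and are NOT the sample ballot.
HYPOTHETICAL_CONTESTS = [7, 4, 8, 3, 2]


def raw_bits(contests):
    """Bits used when every candidate gets its own bit (assumed layout)."""
    return sum(contests)


def valid_patterns(contests):
    """Count valid outcomes: one candidate or no selection in each contest."""
    total = 1
    for n in contests:
        total *= n + 1
    return total


def minimum_bits(contests):
    """Smallest whole number of bits that can index every valid outcome."""
    return (valid_patterns(contests) - 1).bit_length()


if __name__ == "__main__":
    c = HYPOTHETICAL_CONTESTS
    print("raw bits:", raw_bits(c))
    print("valid patterns:", valid_patterns(c), "of", 2 ** raw_bits(c))
    print("bits actually needed:", minimum_bits(c))
```

With these made-up contests, 24 raw bits carry only 4320 valid patterns, so 13 bits would be enough to number them all.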

If someone could write a function using Arthur's compression scheme that

would take the 116-bit string (or any length up to 500 or so) and return a short

string of symbols (and, of course, give us a function to decode the compressed

string), then, presumably, we'd have a very short string of symbols (less than

10 certainly) and then you can add lots of padding without making an

excessively long barcode.

If no one has the time to write the compression (and decompression) function,

then before we add the padding, we might want to check to see if anyone can

eyeball the barcode and figure anything out -- especially after adding two

leading symbols representing the ballot number (a random 4-digit number).

Before we go for the non-compressed padding, maybe we could do a test. Say

we give Jan 10 different electronic ballot images (i.e., ten 116-bit strings

that could result from voting on our example ballot). Let Jan add random

4-digit ballot numbers (encoded with two leading symbols) then give us the ten

barcodes. Let's see if anyone could match up the barcodes with the ballots just by

looking at them. If no one can do a decent job of matching them up then we

should not worry about the padding, for the demo at least.

If anyone can figure out which is which by looking at the barcodes, then

we could also look at some other alternative that would not increase the

length of the barcode.

For example, we could vary the starting point for the string of symbols and

allow it to wrap. Let's say the 24 symbols were (22 for selections and 2 for

ballot number):

YZABCDEFGHIJKLMNOPQRSTUV

Where YZ represents the ballot number and the rest represents the selections.

We could confuse the eyeballer by starting in a different place... like so:

JKLMNOPQRSTUVYZABCDEFGHI

and we could add a symbol that tells where the string really begins.

Say we add N which represents the position the string is really supposed to

start at:

NJKLMNOPQRSTUVYZABCDEFGHI

So now the barcode for identical ballots will look different but it will

still be short enough without using compression.
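The wrap-around idea could be sketched in Python roughly as follows. The letters stand in for barcode symbols, and encoding the start position as a letter (A=0, B=1, and so on) is an assumption made for this illustration; the function names are hypothetical.

```python
import secrets
import string

# Assumption: the position symbol is a letter, with A=0, B=1, ... Z=25.
OFFSET_SYMBOLS = string.ascii_uppercase


def rotate_with_marker(symbols, start=None):
    """Rotate the symbol string and prepend a symbol giving the real start.

    `start` is the index, within the rotated string, of the original first
    symbol. If omitted, a random position is chosen per ballot.
    """
    n = len(symbols)
    if n == 0 or n > len(OFFSET_SYMBOLS):
        raise ValueError("string length must be between 1 and 26 symbols")
    if start is None:
        start = secrets.randbelow(n)
    if not 0 <= start < n:
        raise ValueError("start position out of range")
    rotated = symbols[n - start:] + symbols[:n - start]
    return OFFSET_SYMBOLS[start] + rotated


def unrotate(marked):
    """Recover the original string from the marker symbol plus rotated body."""
    start = OFFSET_SYMBOLS.index(marked[0])
    body = marked[1:]
    return body[start:] + body[:start]


if __name__ == "__main__":
    original = "YZABCDEFGHIJKLMNOPQRSTUV"
    marked = rotate_with_marker(original, start=13)
    print(marked)
    print(unrotate(marked))
```

With start position 13 (the letter N under this mapping), the sketch reproduces the 25-symbol example above and decodes it back to the original 24 symbols.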

Note that even with the non-compressed string that Jan proposes, we still

need to be able to convert a long binary number to a long (35 digit) decimal

number. Do we know if this capability is readily available in Python?
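For what it's worth, Python integers are arbitrary precision, and the built-in int() accepts a base argument, so the conversion itself is short. The sketch below assumes the ballot image arrives as a string of '0' and '1' characters; the function names and the fixed widths (35 digits, 116 bits) are taken from the numbers in this thread and are otherwise illustrative.

```python
def bits_to_decimal(bits, width=35):
    """Convert a string of '0'/'1' characters to a zero-padded decimal string."""
    if not bits or set(bits) - {"0", "1"}:
        raise ValueError("expected a non-empty string of 0s and 1s")
    return str(int(bits, 2)).zfill(width)


def decimal_to_bits(digits, nbits=116):
    """Inverse: convert the decimal string back to a fixed-width bit string."""
    return format(int(digits), "b").zfill(nbits)


if __name__ == "__main__":
    sample = "1" + "0" * 114 + "1"   # an arbitrary 116-bit example
    as_decimal = bits_to_decimal(sample)
    print(as_decimal, len(as_decimal))
    print(decimal_to_bits(as_decimal) == sample)
```

The largest 116-bit value has 35 decimal digits, which is why the decimal string is padded to that width; padding on the way back restores any leading zero bits.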

> It looks like Jan's sample encodes about 36 digits of information. We
> can probably pad that within the size of the ballot to 60 or 70 digits.
> The first two digits will simply indicate the offset to get to the
> real data. So, for example, Jan's data is:
>
> 083076749736557242056487941267521536
>
> Two different voters who vote identically will have on their ballot
> distinct encoded strings, such as:
>
> 13ddddddddddddd083076749736557242056487941267521536dddddddddddddd
>
> and
>
> 21ddddddddddddddddddddd083076749736557242056487941267521536dddddd
>
> (or rather, the barcode version of these).
>
> Where the 'd's are random decimal digits. I am confident that that is
> sufficient to prevent elections workers who see numerous exposed edges
> during a day from beginning to recognize vote patterns.

I am confident of that too. I am less confident that we know that our

barcode readers will be able to handle very long barcodes without problems.
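For reference, the padding David describes could be sketched in Python along these lines. The total length of 65 digits is just one choice within the 60 to 70 he mentions, the decoder assumes it already knows the data is 36 digits long, and the function names are hypothetical.

```python
import secrets

DIGITS = "0123456789"


def pad_vote_digits(data, total_len=65):
    """Build: 2-digit offset + random digits + data + random digits.

    The result always has `total_len` digits, so identical votes produce
    different strings of the same length.
    """
    max_offset = total_len - 2 - len(data)
    if not 0 <= max_offset <= 99:
        raise ValueError("data does not fit, or offset needs more than 2 digits")
    offset = secrets.randbelow(max_offset + 1)
    lead = "".join(secrets.choice(DIGITS) for _ in range(offset))
    tail = "".join(secrets.choice(DIGITS) for _ in range(max_offset - offset))
    return f"{offset:02d}{lead}{data}{tail}"


def unpad_vote_digits(padded, data_len=36):
    """Read the 2-digit offset, skip that many digits, and return the data."""
    start = 2 + int(padded[:2])
    return padded[start:start + data_len]


if __name__ == "__main__":
    jan = "083076749736557242056487941267521536"
    for _ in range(2):
        padded = pad_vote_digits(jan)
        print(padded, unpad_vote_digits(padded) == jan)
```

The random digits come from the secrets module rather than random, on the assumption that the padding should not be predictable.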

Alan D.

==================================================================

= The content of this message, with the exception of any external

= quotations under fair use, is released to the Public Domain

==================================================================

Received on Tue Sep 30 23:17:03 2003
