Re: Bar Code Scheme for Ballot

From: Alan Dechert <adechert_at_earthlink_dot_net>
Date: Thu Aug 21 2003 - 18:38:07 CDT

----- Original Message -----
From: "David Mertz" <voting-project_at_gnosis_dot_cx>
To: <voting-project@lists.sonic.net>
Sent: Thursday, August 21, 2003 3:19 PM
Subject: Re: Bar Code Scheme for Ballot

> |For one thing, your example illustrates a very sparse bar code. Ours
> |will certainly be much more dense than that.
>
> Sure. But mostly that is just an artifact of my attempt at ASCII
> representation (it's about like postal bar codes). I think I could
> recognize familiar patterns in most (but not all) of the samples pointed
> out.
>
> How about including the following in the document:
>
> The demo will evaluate whether an exposed bar code presents a
> danger of compromising anonymity.
>
Okay.

> I certainly do not think the matter is intractable. It's just a
> question of whether we might need to include some kind of flap of paper
> and/or deliberate obfuscation of the positions of vote choices. Neither
> of those things is inordinately difficult, and quite possibly both are
> unnecessary. I'm just not sure at this point.
>

Let me expand a bit on what I said earlier. Doug Jones and I debated part
of this some months ago (I think Arnie Urken might have been in on the
discussion). Doug didn't want to get into fancy compression schemes, but I
think the use of compression would also help lay your concern to rest.

<<<<<
You would not normally be able to tell how someone voted on a particular
race even if you had the pattern memorized from your ballot. That's because
the encoded character may cover only part of a race or parts of more than
one race. Going by your example, say, with the presidential race in our
sample ballot, you could tell (if you could read and memorize the pattern
AND pick out in the code where that pattern starts) if a voter picked the
same candidate as you, but you could not distinguish between an undervote
and a write-in -- and you still could not say which other candidate was
selected unless you had all seven patterns memorized.

Probably, we will be encoding 7 bits with a single character. In this case,
the possibilities for the first 7 bits are:

1000000
0100000
0010000
0001000
0000100
0000010
0000001
0000000
>>>>>>>>>>>>>>

So, there are 8 possible patterns for the first 7 bits. Now let's look at
positions 8 - 16:
00000000
00000001
00000010
00000100
00001000
00010000
00100000
01000000
10000000
10000001
10000010
10000100
10001000
10010000
10100000
11000000
So, there are 16 possible patterns for positions 8 - 16, which means that
there are 128 possible patterns (8x16) for positions 1 - 16 on the ballot.

This means that a single 7-bit character could represent all possible
patterns for the positions 1 - 16. It also means that the first character
encoded on your ballot would match that of another voter who voted for the
same presidential candidate only if all of the selections in positions
8 - 16 are the same too.
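
To make the counting concrete, here is a rough sketch in Python of one way such a mapping table could be built. The two pattern lists are copied from above; the names (FIRST, SECOND, TABLE, encode, decode) and the order in which patterns are assigned to code values are my assumptions for illustration, not a settled design.

```python
# Patterns for the first 7 bits, as listed above.
FIRST = ["1000000", "0100000", "0010000", "0001000",
         "0000100", "0000010", "0000001", "0000000"]

# Patterns for the next group of positions, as listed above.
SECOND = ["00000000", "00000001", "00000010", "00000100",
          "00001000", "00010000", "00100000", "01000000",
          "10000000", "10000001", "10000010", "10000100",
          "10001000", "10010000", "10100000", "11000000"]

# Every valid combination gets one slot: 8 x 16 = 128 = 2**7.
# The ordering here is an arbitrary choice made for this sketch.
TABLE = [a + b for a in FIRST for b in SECOND]
INDEX = {pattern: code for code, pattern in enumerate(TABLE)}


def encode(bits):
    """Return the 7-bit code (0..127) for a valid bit pattern."""
    return INDEX[bits]


def decode(code):
    """Return the bit pattern for a 7-bit code."""
    return TABLE[code]


if __name__ == "__main__":
    print(len(TABLE))                            # 128
    # Same presidential pick, different later selections:
    print(encode("1000000" + "00000001"))        # 1
    print(encode("1000000" + "10000100"))        # 11
```

With this ordering, two voters who picked the same presidential candidate get codes 1 and 11, so the first character alone does not reveal whether they agreed on that race.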

In other words, even if you can figure out the first encoded letter (or
pattern representing it), it does not mean that someone with a different
letter in that position voted for a different presidential candidate.

You might be able to draw some other conclusions by studying the patterns
(if you can actually make them out), but compression would definitely cloud
the picture for you.

The upshot of all of this is that I favor using compression. A similar
compression ratio for the rest of the ballot would mean that all selections
could be encoded with around 8 characters -- certainly fewer than 10
(compression will get worse later in the ballot because there are more
possible patterns per 7 bits).
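
As a back-of-the-envelope check on character counts, the sketch below multiplies the number of valid patterns per group and asks how many 7-bit characters the product needs. The function name chars_needed is made up for this example, and it assumes the groups are packed jointly rather than one table per character, which may not be how we end up doing it. The only real counts are the 8 and 16 from above; the third count (5) is purely hypothetical.

```python
def chars_needed(pattern_counts, bits_per_char=7):
    """Characters needed to distinguish every combination of patterns.

    pattern_counts holds the number of valid patterns in each group.
    """
    total = 1
    for count in pattern_counts:
        total *= count
    # Bits needed to number the combinations 0 .. total-1.
    bits = max(1, (total - 1).bit_length())
    # Ceiling division without floating point.
    return -(-bits // bits_per_char)


if __name__ == "__main__":
    print(chars_needed([8, 16]))      # 1  (128 combinations fit in 7 bits)
    print(chars_needed([8, 16, 5]))   # 2  (hypothetical third group)
```

Groups with more valid patterns use up the 7 bits faster, which is why later parts of the ballot would compress less well.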

Compression would make it harder for the wise-guys trying to figure out the
bar codes and it would cut down on the number of characters we have to put
in the bar code. And, if we want to make it really hard for the wise-guys,
we'd employ a variety of mapping tables so that bar codes will be different
even on ballots with identical votes on them.

For the demo, we can manually construct the compression scheme. For the
larger study, I would want to come up with an algorithm that would
automatically figure out the compression scheme.

Time permitting, we could have a couple of different mapping tables too. In
other words, we'd insert a character in the electronic ballot image (say a
"1", "2", or "3") that indicates which mapping table to use. In production,
we could indicate one of 128 different mapping tables with the one
character.
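
Here is a minimal sketch of the table-selector idea. Everything specific in it is an assumption for illustration: the table ids "1", "2", "3" follow the example above, but the way each table is generated (multiply by an odd number, add an offset, mod 128) is just a convenient stand-in for whatever tables we would actually construct, and the names TABLE_PARAMS, encode and decode are hypothetical.

```python
MODULUS = 128  # number of values a single 7-bit character can take

# Hypothetical (multiplier, offset) pairs. An odd multiplier guarantees
# that each table is a permutation of 0..127.
TABLE_PARAMS = {"1": (1, 0), "2": (5, 17), "3": (11, 42)}


def build_table(mult, offset):
    """Return a list mapping each 7-bit value to its substituted code."""
    return [(mult * i + offset) % MODULUS for i in range(MODULUS)]


TABLES = {tid: build_table(*params) for tid, params in TABLE_PARAMS.items()}
INVERSES = {tid: {code: value for value, code in enumerate(table)}
            for tid, table in TABLES.items()}


def encode(table_id, values):
    """Prefix the table id, then map each value through that table."""
    return [table_id] + [TABLES[table_id][v] for v in values]


def decode(encoded):
    """Read the table id from the first element and undo the mapping."""
    table_id, codes = encoded[0], encoded[1:]
    return [INVERSES[table_id][c] for c in codes]


if __name__ == "__main__":
    votes = [3, 17]                 # two 7-bit values from identical ballots
    print(encode("1", votes))       # ['1', 3, 17]
    print(encode("2", votes))       # ['2', 32, 102]
    print(encode("3", votes))       # ['3', 75, 101]
```

Identical votes come out as different characters under each table, and the reader only needs the leading selector character to reverse the mapping.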

Alan Dechert

==================================================================
= The content of this message, with the exception of any external
= quotations under fair use, are released to the Public Domain
==================================================================
Received on Sun Aug 31 23:17:14 2003