Bar Code Scheme for Ballot

From: Alan Dechert <adechert_at_earthlink_dot_net>
Date: Thu Aug 21 2003 - 13:47:02 CDT

David made some interesting comments about the bar code. I will take his
last comment first.

> As well, ballots with write-in candidates will presumably
> require considerably longer bar codes than will ones that
> only select standard candidates. Even without recognizing
> specific patterns, it might be obvious THAT someone
> voted for a write-in candidate (which partially punctures
> ballot anonymity).
>
Yes. I've given that some thought. That's partly why I've advocated NOT
barcoding the write-in names.

If we commit to barcoding write-ins, we almost certainly will require a
high-density 2D scheme (one that can hold 4,000 chars in a code the size of
a small postage stamp). Our software and printers could probably handle it
fine, but I think the readers are too expensive. If at all possible, I think
we should stick with very cheap commodity readers.

With the cheap (less dense) schemes, you're right about write-ins taking
more space -- possibly much more space and creating an overflow situation.
What if the voter has write-ins for ten contests -- and the names are long?
You also cite (correctly, imo) the additional problem of a partial
reduction in ballot secrecy.

It's a waste to bar code write-ins (the bar code should indicate true/false
if a write-in was entered but not the name). It is extremely rare that
there are enough write-ins to affect the outcome of a race. The quick
preliminary electronic tabulation will show if there are any such races
(with significant write-ins). In these rare cases, the tally will be
delayed slightly while the write-ins are counted (probably by running an OCR
scan on the ballots). Even if we bar coded the write-ins (say with the expensive
4,000 char scheme), this does not fully automate counting write-ins since
names will appear with variations.

In some states, no tally is made of the names of write-ins (unless there are
enough to affect the outcome ... i.e., almost never). In other states, they
tally the write-ins regardless. So, you can see that 4 people wrote in
"Donald Duck" for governor....whatever. This is the type of issue we'd look
at in some detail in the larger funded follow-on study, but it's beyond
the scope of the demo to spend much time on this.

So, for the demo, I'd like to reach an agreement that we will not bar code
write-in names.

> |This will be achieved by the use of a bar code on the long edge
> |of the printout (duplicated on opposite edges -- say within a half
> |inch of the edge).
>
> This raises an anonymity concern in my mind. While people are
> not skilled at reading bar codes, it is not at all difficult to memorize
> a simple pattern.
>
> For example, suppose that I wished to vote for G.Bush for the
> first-listed (Presidential) race. After my own ballot was printed, I
> might notice that the exposed edge started with:
>
> || | | ||| ...
>
> And I might notice that someone else's started:
>
> | ||| | | | ...
>
> Even without knowing the barcode system utilized, I can pretty
> much memorize at a glance the difference between those two
> patterns. If I am a poll worker, I have plenty of opportunity to
> see these exposed edges as they are placed in ballot boxes.
>
> From there, I know quickly who votes the same as I do. In fact,
> after I've seen that pattern two is more common than any other
> initial pattern (other than maybe my own pattern one), I can make
> an awfully good assumption that the second thing is a Democrat
> vote, rather than a minor-party vote.
>
For a variety of reasons, I don't think this is a significant concern for the
demo. In the larger funded follow-on study, we should probably investigate
this question in some detail -- i.e., whether pollworkers might be able to
discern your vote by looking at the bar code. Maybe we could do some
preliminary test with our setup just to make sure before we commit to
releasing the demo, but I seriously doubt there will be a problem.

For one thing, your example illustrates a very sparse bar code. Ours will
certainly be much more dense than that. Look at some of the more dense bar
codes here:

http://www.bizfonts.com/fontpackage/

You'd have to really study them with a magnifying glass to catch a pattern
that you can remember. Pollworkers would likely see the bar code only for a
second or so -- if at all.

Secondly, if the voter and pollworkers follow procedures, the pollworker
will never see the bar code. The ballot is supposed to be taken to the
ballot box face-down in the privacy folder and placed on the ledge in front
of the slot in the ballot box. The folder is removed and the ballot is slid
into the ballot box. No one sees the bar code. In the case of the blind
voter, the pollworker may see the bar code but over the course of Election
Day at any given polling place, there should be few instances where
pollworkers see the bar code. So even if they had eagle eyes and
computer-like brains, there would be little opportunity to figure out the
bar codes.

Thirdly, understanding the encoding scheme is much more complex than you
suggest. For example, the first 10 or so encoded characters will have to do
with things like ballot number, precinct, county, state, etc. So if someone
is looking at your bar code trying to figure out your vote, they have to be
able to pick out where the pattern begins that is applicable to the race in
question -- not likely to be able to do that at a glance.

You would not normally be able to tell how someone voted on a particular
race even if you had the pattern memorized from your ballot. That's because
the encoded character may cover only part of a race or parts of more than
one race. Going by your example, say, with the presidential race in our
sample ballot, you could tell (if you could read and memorize the pattern
AND pick out in the code where that pattern starts) if a voter picked the
same candidate as you, but you could not distinguish between an undervote
and a write-in -- and you still could not say which other candidate was
selected unless you had all seven patterns memorized.

Probably, we will be encoding 7 bits with a single character. In this case,
the possibilities for the first 7 bits are:
1000000
0100000
0010000
0001000
0000100
0000010
0000001
0000000

Each of these patterns would be mapped to a character. The next 7 bits
cover the write-in slot in the presidential race and the first 6 slots in
the US Senate race. So,

1000000 is possible for the next 7, but so is

1100000
and
0100000

So, to know if the first candidate for US Senate was selected, you have to
know that there are two possible patterns for that (and memorize those).
The same goes for knowing whether a write-in for President was entered.
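
To make the chunking concrete, here is a rough sketch in Python. The layout
is only what is described above (7 listed presidential candidates, then the
presidential write-in flag, then the first 6 US Senate slots); the function
names and the zero-padding of the last chunk are my own assumptions for
illustration, not a settled design.

```python
def pack_bits(bits, width=7):
    """Split a list of 0/1 selection flags into width-bit strings.

    The last chunk is zero-padded on the right (an assumption).
    """
    padded = bits + [0] * (-len(bits) % width)
    return ["".join(str(b) for b in padded[i:i + width])
            for i in range(0, len(padded), width)]


def ballot_bits(president, pres_write_in, senate):
    """Build the first 14 selection flags of a hypothetical ballot.

    president:     index 0-6 of the chosen listed candidate, or None
    pres_write_in: True if a presidential write-in was entered
    senate:        index 0-5 among the first six US Senate slots, or None
    """
    bits = [0] * 14
    if president is not None:
        bits[president] = 1
    if pres_write_in:
        bits[7] = 1              # flag only; the name itself is not encoded
    if senate is not None:
        bits[8 + senate] = 1
    return bits


# First listed presidential candidate, first Senate candidate:
print(pack_bits(ballot_bits(0, False, 0)))     # ['1000000', '0100000']
# Presidential write-in, first Senate candidate:
print(pack_bits(ballot_bits(None, True, 0)))   # ['0000000', '1100000']
```

Note that the first Senate candidate shows up as either 0100000 or 1100000 in
the second chunk, depending on whether a presidential write-in was entered.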

We will need a table that shows which character is mapped to which
pattern -- this can be arbitrary. That is, 1100000 could be mapped to a
lower case "b" or an upper case "F." For reasons other than the bar coding,
we may also want to make this changeable. Maybe we'd have 128 mapping
schemes that would be indicated with a single 7-bit character.
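
Here is a minimal sketch of what a changeable mapping might look like,
written in Python. Everything in it is an assumption for illustration: the
table maps each 7-bit pattern to a character code from 0 to 127, a seeded
shuffle stands in for however the 128 tables would really be chosen, and
the scheme number is simply put first as an unmapped code.

```python
import random


def make_table(scheme_id):
    """Return a permutation of 0..127: table[pattern_value] -> character code."""
    if not 0 <= scheme_id < 128:
        raise ValueError("scheme_id must fit in 7 bits")
    codes = list(range(128))
    # Arbitrary but repeatable ordering; a stand-in for a published table.
    random.Random(scheme_id).shuffle(codes)
    return codes


def encode(patterns, scheme_id):
    """Map 7-bit pattern strings to character codes, scheme number first."""
    table = make_table(scheme_id)
    return [scheme_id] + [table[int(p, 2)] for p in patterns]


def decode(codes):
    """Invert encode(): read the scheme number, then look each code up."""
    scheme_id, body = codes[0], codes[1:]
    table = make_table(scheme_id)
    inverse = {code: value for value, code in enumerate(table)}
    return [format(inverse[c], "07b") for c in body]


codes = encode(["0000000", "1100000"], scheme_id=5)
assert decode(codes) == ["0000000", "1100000"]
```

Because each table is a permutation, the same pattern prints as a different
character under a different scheme, and decoding only needs the leading
scheme number.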

In any case, I don't think there is much danger of people breaking our
system in the demo because they can look at the bar code and tell how
someone voted. We can check to make sure.

-- Alan Dechert

==================================================================
= The content of this message, with the exception of any external
= quotations under fair use, are released to the Public Domain
==================================================================
Received on Sun Aug 31 23:17:13 2003
