Bits and what they reveal

From: David Mertz <voting-project_at_gnosis_dot_cx>
Date: Wed May 05 2004 - 13:55:20 CDT

Arthur Keller wrote:
> 36+80 (larger if self-delimiting)+48+16 = 180 bits (at the very least).
> Of course, this doesn't handle IRV elections.

Correct. As Alan pointed out, we realized we don't need padding if we
use obfuscation of votes.

The sum above *DOES* handle IRV elections (or generically, ranked
preference). Or at least it's enough for one or two ranked preference
contests combined with a pretty good number of single-selection
contests. See all my posts in the "Compression, encoding, entropy"
thread.

180 bits is right about at the limit of practical 1-D codes. So yeah,
there are a lot of ways we might end up requiring 2-D: if we need a
bunch of ranked preference contests; if we decide that a global ID is a
no-go, and we actually need the date, county, etc. spelled out in the
barcode; or if we decide we need to sign with a huge digest like SHA.
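
To make the arithmetic concrete, here is a rough sketch of the bit
budget as a small Python tally. The four widths are the terms of
Arthur's sum; the field labels are placeholders of mine, and treating
180 bits as the 1-D ceiling is a working assumption taken from the
paragraph above, not a number from any barcode spec. The only outside
fact used is that a SHA-1 digest is 160 bits.

```python
# Widths are the four terms of the quoted sum; the labels are placeholders,
# since the sum itself does not say what each term encodes.
FIELD_BITS = {
    "term_1": 36,
    "term_2": 80,  # "larger if self-delimiting"
    "term_3": 48,
    "term_4": 16,
}

# Assumption: treat 180 bits as the working ceiling for a 1-D code,
# per "right about at the limit" above. It is not a figure from any spec.
ONE_D_BUDGET_BITS = 180

SHA1_DIGEST_BITS = 160  # SHA-1 output size


def total_bits(fields):
    """Sum the widths of all fields in the payload."""
    return sum(fields.values())


def headroom(fields, budget=ONE_D_BUDGET_BITS):
    """Bits left under the budget; negative means the payload does not fit."""
    return budget - total_bits(fields)


if __name__ == "__main__":
    print(total_bits(FIELD_BITS), headroom(FIELD_BITS))
    with_digest = dict(FIELD_BITS, digest=SHA1_DIGEST_BITS)
    print(total_bits(with_digest), headroom(with_digest))
```

Under those assumptions the current payload has no headroom at all, and
adding a single SHA-1 digest nearly doubles it.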

I'm pretty close to convinced that we should go to a uniform 2-D
barcode. We -are- getting close enough to the limits that "just in
case" breathing room starts to look appealing. That's pending a little
more of an answer about good GPL tools, but even there, Karl's pointer
is good evidence in that direction.

However, there's something else I'd like members to keep in mind:

   The cheapest, fastest and most reliable components of a
   computer system are those that aren't there.
       --Gordon Bell, Encore Computer Corporation

I don't want to "skimp on bits" as Karl warns against. But I *do* think
there's a difference between spaces that are known to grow (numbers of
IP addresses) and things that are size-limited in principle. An
election is in the size-limited category: it can't grow unboundedly (or
voters would never be able to complete a ballot).

The reason I don't want to throw the kitchen sink into the barcode
isn't the raw bit length, but that every feature encoded in that
barcode requires source code to process. The more features we use, the
more points of failure we introduce. Arthur tends to point out
opportunities for extra redundancy, but every redundant feature is
another thing that might go wrong. It's not a simple good; there's a
tradeoff.

I only want to include the ECCs, CRCs, signatures, hashes, IDs,
election context flags, etc. that have a good argument to support them.
It's easy to imagine lots of things that "might be nice" to throw in
there (assuming the expansive bit-space of 2-D barcodes). But every
one of those "nice-to-have" features might have been programmed wrong,
especially in border cases.

Even more than the Gordon Bell warning, however, elections have an
extra reason for KISS. We should be able to justify EVERY bit we put
into the barcode, because every bit potentially leaks information.
There are a *whole lot* of ways to enable statistical attacks against
voter anonymity. Every time we add a new barcode feature, we have to
make a VERY good security argument for why that feature cannot possibly
leak any voter-identifying information.

And ideally, we need to be able to explain that argument to the VOTERS
THEMSELVES. If we have kilobytes of crypto-this and redundancy-that,
it raises a very reasonable red flag in voters' minds that OVC (or
corrupt election workers) might be collecting secrets about the voter.
I don't want our answer to be "Just trust us, we wouldn't do that!" I
want it to be: "Bits 128-156 are needed for X; bits 157-255 are needed
for Y; and here's how we prove those bits don't contain anything other
than X and Y."
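
One way to keep ourselves honest about that is to write the layout down
as data and check it mechanically. The following is only a sketch: the
`Field` record, the `check_layout` helper, and the 256-bit total are
hypothetical names and numbers of mine, not anything we have agreed on.
It flags any bit that no field claims, any bit claimed twice, and any
field with no stated purpose. It does not prove what the bits actually
contain; that part still needs a real argument.

```python
from typing import List, NamedTuple


class Field(NamedTuple):
    first_bit: int  # inclusive
    last_bit: int   # inclusive
    purpose: str    # the public justification for these bits


def check_layout(fields: List[Field], total_bits: int) -> List[str]:
    """Return a list of problems; an empty list means every bit is justified once."""
    problems = []
    next_free = 0  # lowest bit not yet claimed by any field
    for f in sorted(fields):
        if f.last_bit < f.first_bit:
            problems.append(f"bits {f.first_bit}-{f.last_bit}: empty or reversed range")
            continue
        if not f.purpose.strip():
            problems.append(f"bits {f.first_bit}-{f.last_bit}: no stated purpose")
        if f.first_bit > next_free:
            problems.append(f"bits {next_free}-{f.first_bit - 1}: unaccounted for")
        elif f.first_bit < next_free:
            overlap_end = min(f.last_bit, next_free - 1)
            problems.append(f"bits {f.first_bit}-{overlap_end}: claimed twice")
        next_free = max(next_free, f.last_bit + 1)
    if next_free < total_bits:
        problems.append(f"bits {next_free}-{total_bits - 1}: unaccounted for")
    elif next_free > total_bits:
        problems.append(f"bits {total_bits}-{next_free - 1}: beyond the barcode")
    return problems


if __name__ == "__main__":
    # Hypothetical layout: only the 128-156 and 157-255 ranges echo the text.
    layout = [
        Field(0, 127, "W"),
        Field(128, 156, "X"),
        Field(157, 255, "Y"),
    ]
    print(check_layout(layout, 256))
    print(check_layout(layout[1:], 256))
```

The point of keeping the purpose string next to the bit range is that
the same table we check in code is the one we can hand to a voter.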

Yours, David...
==================================================================
= The content of this message, with the exception of any external
= quotations under fair use, is released to the Public Domain
==================================================================
Received on Mon May 31 23:17:17 2004
