# Re: How big could a bar code be if a bar code could be big?

From: Alan Dechert <alan_at_openvotingconsortium_dot_org>
Date: Fri Apr 30 2004 - 18:44:41 CDT

Karl,

> It's my sense that we need to move out of "the demo" thinking
> and begin thinking of the first pilot system. ......
>
Yes.

> And in such a system, I'm beginning to become fairly convinced that
> the one-dimensional bar codes don't have enough space to hold
> enough information.
>
So far, I've heard conjecture on this point. I'm not convinced either way
but I'd like to see calculations based on actual historical ballots (not,
"what if we have a ballot with 100 IRV contests?") that give us data on what
percentage of ballots could have been encoded--with and without
compression--with, say, 150 bits.
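
A minimal sketch of the kind of calculation meant here, under stated assumptions: each contest is modeled as "choose up to k of n candidates" and is given just enough bits to index every legal selection, with no compression. The function names, the `(candidates, max_choices)` contest model, and the sample ballot styles are all hypothetical placeholders; real historical ballot definitions would have to be substituted to get a meaningful percentage.

```python
from math import comb

BIT_BUDGET = 150  # the figure floated in the message above


def contest_bits(candidates: int, max_choices: int = 1) -> int:
    """Bits needed to index every legal selection in one contest.

    A legal selection is any set of 0..max_choices candidates
    (the empty set is an undervote).
    """
    legal = sum(comb(candidates, k) for k in range(max_choices + 1))
    # Smallest b such that 2**b >= legal, computed with integers only.
    return (legal - 1).bit_length()


def ballot_bits(contests) -> int:
    """Total bits for a ballot given as a list of (candidates, max_choices)."""
    return sum(contest_bits(n, k) for n, k in contests)


def fraction_fitting(ballot_styles, budget: int = BIT_BUDGET) -> float:
    """Fraction of ballot styles whose encoding fits within the bit budget."""
    fits = sum(1 for contests in ballot_styles if ballot_bits(contests) <= budget)
    return fits / len(ballot_styles)


# Placeholder ballot styles, NOT historical data.
sample_styles = [
    [(4, 1)] * 12,  # 12 contests, 4 candidates each, vote for one
    [(8, 1)] * 60,  # 60 contests, 8 candidates each, vote for one
]
print([ballot_bits(s) for s in sample_styles], fraction_fitting(sample_styles))
```

Weighting each ballot style by the number of ballots actually cast would give the percentage asked for above; this sketch counts styles only.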

Then we need some cost-benefit analysis. If 99.9% of the ballots need less
than 150 bits, that's something worth considering. If it's 80-20, that's
another story--still not obvious if we go with strictly 2-d barcodes. Is
there GPL 2-d software available? I haven't seen any (maybe Jan can get
started on that ;-) ).

How many scanners are really needed and how much do they really cost?
Software-hardware costs are generally moving south so maybe we can make some
projections on future costs of 2-d technology (we know that 1-d technology
costs very little). If 2-d technology is becoming cheap enough fast enough,
then the debate may be moot.

> Compression doesn't always compress - it's a gamble. Sometimes
> compression actually results in more bits than the original.
>
> > As discussed in this OCT thread, we added error detection code...
>
> I haven't seen that in the code. I've seen range and value checks. But I
> don't see anything that resembles a CRC or ECC.
>
Basically, it checks to see if there are overvotes. I know this is not the
kind of code you are talking about. But it's effective because only a small
fraction of possible 40-digit numbers can represent valid ballots (in our
demo; of course this will vary with other ballots). If you get the BVA
running, try changing one of the digits and you will get an "invalid ballot"
message, since the odds are very remote that you can change one digit and
still have a valid ballot. In testing, you may find you can change the last
digit and still have a valid ballot, but in practice this is extremely
unlikely to happen: the digits are encoded in pairs, and two errors would
have to occur to get past the scanner's own checksum, with one error
canceling the other while preserving the checksum value.
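
To illustrate why an overvote check rejects most corrupted values, here is a rough sketch. It does not reproduce the demo's actual encoding: the layout (one bit per candidate, grouped by contest) and the names `LAYOUT`, `is_valid`, and `valid_fraction` are assumptions made up for this example.

```python
from fractions import Fraction
from math import comb

# Hypothetical layout: (candidates, max_choices) per contest,
# stored as one bit per candidate in contest order.
LAYOUT = [(8, 1), (8, 1), (4, 1), (6, 3)]


def is_valid(bits: str, layout=LAYOUT) -> bool:
    """Return True if the bit string has the right length and no overvotes."""
    if len(bits) != sum(n for n, _ in layout) or set(bits) - {"0", "1"}:
        return False
    pos = 0
    for n, max_choices in layout:
        if bits[pos:pos + n].count("1") > max_choices:
            return False  # overvote in this contest
        pos += n
    return True


def valid_fraction(layout=LAYOUT) -> Fraction:
    """Fraction of all possible bit patterns that pass the overvote check."""
    frac = Fraction(1)
    for n, max_choices in layout:
        legal = sum(comb(n, k) for k in range(max_choices + 1))
        frac *= Fraction(legal, 2 ** n)
    return frac


good = "10000000" + "00000000" + "0100" + "101001"
bad = "11000000" + "00000000" + "0100" + "101001"  # one extra bit set
print(is_valid(good), is_valid(bad), float(valid_fraction()))
```

In this toy layout a random pattern is rarely valid, but a change that only clears a set bit still passes, since it looks like an undervote.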

> (A Hamming ECC code would be a nice touch. Yes, many bar code formats
> have built-in error correction, but having an additional layer could catch
> errors in the encoding/decoding software routines. My own experience
> testing networking software suggests that data encoding/decoding is a
> place where many implementations have undetected flaws.)
>
> In light of the potential of having the bar-code digitally signed - this
> implies a message digest as part of the data represented by the bar code.
> Such digests are typically at least 32 or 64 bits long. An ECC would, I
> believe, be best applied as the outermost wrapper, i.e. done after and
> appended after the message digest.
>
> I see in the archives a note that the maximum reliable bar code length is
> 30 symbols - we are using 24 already. Each symbol gives about 6.5 bits of
> information. So adding a message digest (32-bits) and ECC (8 bits?) means
> about 7 more symbols - bringing us right up to, or even past, the maximum
> and not leaving any room for bigger elections.
>
Maybe or maybe not. I'd still like to see some real data on ballot sizes.
I think most ballots have fewer than 12 contests. I don't really know--this
is only based on some light analysis I did a couple of years ago.
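
For reference, the symbol-count arithmetic in the quoted paragraph can be checked in a few lines. The figures (24 symbols in use, about 6.5 bits per symbol, a 30-symbol limit, a 32-bit digest, and a guessed 8-bit ECC) come from the quoted text; the variable and function names are just for this sketch.

```python
from math import ceil

BITS_PER_SYMBOL = 6.5  # approximate figure from the quoted text
MAX_SYMBOLS = 30       # "maximum reliable bar code length"
USED_SYMBOLS = 24      # symbols the demo already uses


def extra_symbols(extra_bits: int, bits_per_symbol: float = BITS_PER_SYMBOL) -> int:
    """Whole symbols needed to carry extra_bits more bits."""
    return ceil(extra_bits / bits_per_symbol)


digest_bits, ecc_bits = 32, 8  # the 8-bit ECC size is a guess in the original
payload_bits = USED_SYMBOLS * BITS_PER_SYMBOL
needed = USED_SYMBOLS + extra_symbols(digest_bits + ecc_bits)

print(f"current payload: about {payload_bits:.0f} bits")
print(f"symbols with digest and ECC: {needed} (limit {MAX_SYMBOLS})")
```

So the current 24 symbols carry roughly 156 bits, and adding the digest and ECC comes to 31 symbols against a limit of 30.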

Alan D.
==================================================================
= The content of this message, with the exception of any external
= quotations under fair use, are released to the Public Domain
==================================================================
Received on Fri Apr 30 23:17:28 2004
