Re: Compression, encoding, entropy

From: David Mertz <voting-project_at_gnosis_dot_cx>
Date: Mon May 03 2004 - 10:26:36 CDT

>> ECCs do a better job with the integrity constraint you want than do
>> self-delimiting contests.
> It's a both - and, not an either - or.

Well, it's a both if bandwidth isn't an issue. But since the whole
discussion (that is, the one that I attached the current subject line
to[*]) started with the bandwidth constraints of barcodes... well, best
case seems like an interesting topic for this thread.

Also, it's only a both if extra layers of software that potentially
fail aren't an issue. Adding the source code for more complicated
self-delimiting encodings is another point of potential error. There's
a trade-off between redundancy that "might be cool" to have and the
danger introduced by the source code that knows about that redundancy.

I.e.

     The cheapest, fastest and most reliable components of a
     computer system are those that aren't there.
       --Gordon Bell, Encore Computer Corporation

In truth, if we decide--for whatever reason--to take the plunge to
using 2-D barcodes, we've just given ourselves enough breathing room
that I stop caring in the slightest about an efficient encoding. You
almost can't be inefficient enough to use the whole storage space (even
adding phonemes in there). For example, from what I can see, PDF417
(which is public domain) should hold somewhere along the lines of 16000
bits, which is quite a breather from the 100-ish bits we're worrying
about for Code128.

[*] BTW, it's good to start new Subject lines when threads diverge.
For example, the subthread of this one that discusses printer
technologies would be more useful if it were called, say, "barcodes and
printer technologies" since it isn't about compression/encoding
anymore.

> The efficiency in encoding compared with straight binary is:
> (d*g)/floor(lg(n)+1)

OK, maybe I'll add that formula to my script.

Incidentally, you say you don't know Python, Arthur... but you just
wrote it in all your formulas. :-) That's the beauty of the language,
things are spelled the way you would generally guess they are without
knowing the language.
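
To make that concrete, here is a minimal sketch of the quoted formula as
runnable Python. The only thing missing from the original spelling is a
definition of lg, which I assume means log base 2. The meanings of d, g
and n are whatever Arthur's earlier message defined; they are not
restated here, and the function name efficiency() is just my label.

```python
from math import floor, log2

def lg(n):
    """Base-2 logarithm (assumed meaning of 'lg' in the quoted formula)."""
    return log2(n)

def efficiency(d, g, n):
    # The quoted formula, character for character. What d, g and n stand
    # for is defined in the earlier message, not here.
    return (d*g)/floor(lg(n)+1)

# floor(lg(n)+1) is the number of binary digits needed to write n,
# for any integer n >= 1; Python integers expose this as bit_length().
assert all(floor(lg(n)+1) == n.bit_length() for n in range(1, 1025))
```

The denominator is simply the width of n in straight binary, which is
why the formula reads as a comparison against straight binary.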

> That can be done with a length prefix. The problem is how long is a
> length prefix is needed?

Right, except it need not be a prefix, as I said. It can be contained
in an entirely different location on the ballot, since it is only
needed or used in the case of exceptional failure.
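
As for "how long": a rough sketch, assuming a fixed-width binary length
field and a known maximum payload size. The helper names and the
example sizes are only illustrations (the sizes are the ballpark
figures from earlier in this message), and nothing here says where on
the ballot the field is printed.

```python
def length_field_bits(max_len):
    """Bits needed for a fixed-width field holding any length 0..max_len."""
    return max(1, max_len.bit_length())

def length_field(payload_bits, max_len):
    """Return the payload length as a fixed-width binary string.

    payload_bits is a string of '0'/'1' characters. The field is returned
    on its own, since it may be stored apart from the payload.
    """
    if len(payload_bits) > max_len:
        raise ValueError("payload longer than the declared maximum")
    width = length_field_bits(max_len)
    return format(len(payload_bits), "0{}b".format(width))

# Ballpark sizes only: roughly 100 bits versus roughly 16000 bits.
print(length_field_bits(100))    # 7
print(length_field_bits(16000))  # 14
```

So under these assumptions the length costs 7 bits out of a 100-bit
budget, which is noticeable, and 14 bits out of 16000, which is not.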

> Personally, I like the idea of a fixed station under which you place
> the ballot rather than a wand scanner type device.

Yes, yes, yes! Let us PLEASE lose the wand ("cat") scanners as soon as
humanly possible. They are a terrible kludge that is utterly
embarrassing as soon as we show them to actual blind people. I
absolutely cannot imagine that a non-fixed scanner will be of any value
for actual production. The CueCat wands were a cheap way for
developers to test things, but I don't think they should ever even be
shown to anyone outside the developer group... they look far too
unprofessional.
==================================================================
= The content of this message, with the exception of any external
= quotations under fair use, are released to the Public Domain
==================================================================
Received on Mon May 31 23:17:04 2004