Options arising from scanner tech...

From: Jim March <jmarch_at_prodigy_dot_net>
Date: Sat May 07 2005 - 11:38:10 CDT


Arthur Keller asked for my help looking for "megascanners" suitable for
central tabulator operations.

I did so yesterday (a day late, sorry Arthur!) and found that first,
suitable gear is available at a price point we can cope with and second,
ultra-high-end bulk scanners have a number of features that give us
interesting options, now or in the future.

The best "bang for the buck" I was able to find so far is a $6k range
Canon, apparently their best scanner to date:


I won't repeat the features/specs listed at that site but here's some
highlights and notes:

SPEED: 90 pages per minute doing single-sided mono 200x200dpi. I
suspect that resolution is adequate. The speed I'm a bit more concerned
about, but when you find scanners that can do double that speed you find
the price is far MORE than double. This scanner is right at the sweet
spot of "bang for the buck", and I would submit that a bank of four of
these would be much more useful (never mind more failure-tolerant) than a
single monster that's four times faster and costs more than all four.
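
To put rough numbers on the "bank of four" idea, here is a minimal
sketch of feed time versus scanner count. The 100,000-sheet workload
is a made-up figure for illustration, and the math assumes the rated
90ppm with no jams or reload time, so treat the results as a floor.

```python
import math

def minutes_to_scan(sheets: int, scanners: int, ppm: float = 90.0) -> float:
    """Feed time in minutes when `sheets` are split evenly over a bank
    of identical scanners, each running at its rated `ppm`."""
    if scanners < 1:
        raise ValueError("need at least one scanner")
    # The busiest scanner gets the rounded-up share of the pile.
    per_scanner = math.ceil(sheets / scanners)
    return per_scanner / ppm

SHEETS = 100_000  # hypothetical county workload, not from any real data

for n in (1, 3, 4):
    hours = minutes_to_scan(SHEETS, n) / 60
    print(f"{n} scanner(s): {hours:.1f} hours of feed time")
```

The three-scanner row is the "one unit died on election night" case:
slower, but the count still finishes.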

DUPLEX SCANNING: yup, it can scan both sides at once and when doing so
is rated 180ppm - this is very common among high-end scanners suggesting
that the mechanical paper-handling is what's driving the "speed limit"
versus the actual scanning. It also suggests that if we start running
out of space on the "final OVC terminal product" printed output (due to
large complex races) we can go double-sided instead of the legal-size
paper that has already been contemplated. And duplex printers for that
side are now down in the $350 or less range, with a particular HP looking
like a good deal there with its 250-page paper hopper:

11x17 SCANNING: this is yet another option for "really REALLY big
ballots". But consider too that if we do like the idea of "two paper
copies", a sheet of 11x17 perforated down the middle turns into two
8.5x11s. I'm not trying to restart that discussion, just noting the
scanner's capability. Naturally it can do smaller sheets, standard
letter, legal, Euro-weenie formats, etc...

INTERFACES: your choice of USB2 or SCSI3. I would guess that if we're
going to run multiple scanners off of one PC, SCSI3 might be more
robust...esp. if we used a multi-ported SCSI card (basically two or more
complete controllers on one PCI card). See "tabulator design notes"
below.

PAPER THICKNESS: 0.06 - 0.15mm when autofeeding.

INPUT HOPPER: "50mm", at least enough for a 500-sheet hopper load. More
expensive scanners can do 1,000.
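
The 500-sheet figure depends on paper weight. A quick check, assuming
the stack packs flat with no air gaps (an idealization) and using the
0.06 - 0.15mm thickness range quoted above:

```python
import math

def sheets_in_hopper(depth_mm: float = 50.0, sheet_mm: float = 0.10) -> int:
    """Whole sheets that fit in a hopper of the given depth, assuming
    the stack is perfectly flat."""
    # The small epsilon guards against float rounding just below a whole number.
    return math.floor(depth_mm / sheet_mm + 1e-9)

for thickness in (0.06, 0.10, 0.15):
    print(f"{thickness:.2f}mm stock: {sheets_in_hopper(50.0, thickness)} sheets")
```

So 500 sheets holds for paper around 0.10mm; at the thick end of the
supported range a full hopper is closer to 333 sheets.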

OPTIONS/CONSUMABLES: Per the site, "Imprinter, Endorser ED600, Hard
Counter, Barcode Module, Exchange Roller Kit". Either the imprinter or
endorser should be something we could use to date/time stamp the backs
or edges at each scan, providing an audit trail of scans? The "hard
counter" also has audit implications if it's reliable and the "barcode
module" has interesting possibilities! Again, I'm not proposing any
given use of these, I'm just commenting on what seems possible...

Oh, and the replacement rubber roller kits are user-installable. Dunno
how expensive but...if cheap enough, we might want to do it every two
years before a major election? Rubber ossifies and turns to crap after
a while...


Drivers and long-term support: it strikes me that there are almost
certainly no Linux drivers available for these scanners. We'll have to
do our own? And Canon will no doubt come up with later hardware models
down the road that will need, you guessed it, more drivers (or at least
tweaked drivers).

But there may be an opportunity here. We might want to partner up with
a company that's in the Linux business and would have an interest in
megascanner drivers being available. We do an initial driver for the
current model and then we cut a deal with Red Hat, Novell or ??? to
offload scanner driver support onto them as a long-term thing. Or maybe
even partner with Canon, who might have an interest in making sure Linux
is supported for their megascanners. Maybe a co-development deal on the
initial driver. Either way, as long as the scanner drivers are open
source/GNU/etc we don't need to be constantly screwing around with that
aspect if we can help it.


Tabulator design notes: I'm a bit concerned about the volume and speed
of data transfer and processing needed to support a whole bank of these
on one PC, even a very high end one.
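
A back-of-envelope sketch of the raw data volume, for what it's worth.
It assumes letter-size sheets, 1 bit per pixel, and no compression on
the wire; the scanner or driver may well compress, in which case the
real numbers come out lower.

```python
def page_bytes(width_in: float = 8.5, height_in: float = 11.0,
               dpi: int = 200, bits_per_pixel: int = 1) -> int:
    """Uncompressed size in bytes of one scanned side."""
    pixels = round(width_in * dpi) * round(height_in * dpi)
    return (pixels * bits_per_pixel + 7) // 8  # round up to whole bytes

def bytes_per_second(images_per_minute: float) -> float:
    """Sustained raw data rate from one scanner at its rated speed."""
    return images_per_minute / 60 * page_bytes()

simplex = bytes_per_second(90)    # 90ppm single-sided
duplex = bytes_per_second(180)    # 180 images/min when scanning both sides

print(f"one side:          {page_bytes():,} bytes")
print(f"one scanner:       {simplex / 1e6:.2f} MB/s simplex, {duplex / 1e6:.2f} MB/s duplex")
print(f"bank of four:      {4 * duplex / 1e6:.2f} MB/s duplex")
```

The transfer rate by itself looks modest; the open question is how
much CPU it takes to read the marks off each image at that pace.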

Here's a thought to kick around: instead of building a single "megabox",
let's build a Linux cluster of lesser rigs each processing input from
one scanner! Lookit, a P4 2GHz box is pretty cheap, even with plenty of
RAM - under $700ish since we don't need a major graphics card. Network
as many as you need scanners (at least two even in a small county for
redundancy) via 100gig Ethernet, put a really good disk controller and
hard disk on one (or two?), and have each lesser box dump its received and
processed scans up using a protocol that lets them "take turns" enough
to prevent disk churn. GNU/Linux databases that work on a cluster have
to be available by now, right?

Each unit in the cluster can connect with its scanner via USB2 instead
of SCSI. USB isn't as good an I/O, true, but at least if somebody kicks
a cord nothing fries and it should work OK for one scanner on one CPU.
(Software might have to process each "stack" of sheets as a "batch" and
if the batch fails halfway through via jam, kicked cord, scanner dies,
whatever, it kicks out that "batch" and tells the user to start over with
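
The all-or-nothing batch idea could look something like the sketch
below. Everything here is hypothetical: `ScanError`, the `feed`
iterable standing in for a scanner driver, and the `commit` callback
standing in for whatever stores results on the cluster. The point is
only that nothing from a batch is stored until the whole stack has
come through cleanly.

```python
from typing import Callable, Iterable, List

class ScanError(Exception):
    """Hypothetical error for a jam, kicked cord, dead scanner, etc."""

def scan_batch(batch_id: str, feed: Iterable[bytes],
               commit: Callable[[str, List[bytes]], None]) -> bool:
    """Collect every image in the batch in memory; commit only if the
    whole stack scanned without error."""
    pages: List[bytes] = []
    try:
        for image in feed:
            pages.append(image)
    except ScanError as err:
        # Throw away the partial batch and ask the operator to redo it.
        print(f"batch {batch_id} rejected after {len(pages)} sheets: {err}")
        return False
    commit(batch_id, pages)
    return True

# Tiny demo with fake feeds in place of real hardware.
store = {}

def good_feed():
    yield from (b"page1", b"page2", b"page3")

def jammed_feed():
    yield b"page1"
    yield b"page2"
    raise ScanError("paper jam")

ok = scan_batch("A", good_feed(), store.__setitem__)
bad = scan_batch("B", jammed_feed(), store.__setitem__)
print(ok, bad, sorted(store))
```

Holding a whole batch in RAM is fine at these image sizes, and it
means a failed batch never leaves partial data on the shared disk.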

(If anybody is freaking out over the cost of the cluster and $6k per
scanner, please remember that Diebold's tabulator *software* alone is
$44k. We can do the hardware for a five-PC cluster and five scanners
for that.)
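
Checking that arithmetic with the round figures quoted in this
message ($6k per scanner, $700 per PC). Network gear, the good disk
controller, and scanner options are not included, so this is a rough
floor and not a quote.

```python
SCANNER_USD = 6_000            # "$6k range" Canon
PC_USD = 700                   # "under $700ish" P4 box
DIEBOLD_SOFTWARE_USD = 44_000  # tabulator software alone

def cluster_cost(units: int) -> int:
    """Hardware cost of `units` scanner-plus-PC pairs."""
    return units * (SCANNER_USD + PC_USD)

total = cluster_cost(5)
print(f"five PCs + five scanners: ${total:,}")
print(f"left over vs. Diebold software: ${DIEBOLD_SOFTWARE_USD - total:,}")
```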

If anybody wants to look at other scanners, feel free...two good
shopping links I found are:



You'll see that most of the features found on this Canon are found on
other big scanners: single-pass duplex scanning, 11x17 paper support,
optional imprinters/markers of various types, counters, etc. Note the
Fujitsu and HP models as being the next jumps up from this Canon.

Oh, one more thing: Diebold connects multiple scanners to their single
tabulator box via RS232!!! The only way they could possibly do that is
if their central count mega-scanners (used mostly for absentee ballots)
have as much internal processing and ballot layout RAM/firmware inside
as their precinct optical scanners. In other words, their scanners (big
and small) are custom "black boxes" themselves and can store a
vote-shaving process...but they're also transmitting FAR less data up
the wire to the PC.
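
A rough sanity check on why raw images can't be going over that
serial link. The 115,200 baud rate and 8N1 framing (10 bits on the
wire per byte) are my assumptions about a typical PC serial port, not
known Diebold settings, and the 100-byte vote record is a made-up
size for comparison.

```python
def serial_seconds(nbytes: int, baud: int = 115_200,
                   bits_per_byte: int = 10) -> float:
    """Seconds to push `nbytes` over an RS232 link with 8N1 framing."""
    return nbytes * bits_per_byte / baud

RAW_PAGE_BYTES = 1700 * 2200 // 8  # letter page, 200dpi, 1 bit per pixel
VOTE_RECORD_BYTES = 100            # hypothetical per-ballot tally record
SECONDS_PER_PAGE_AT_90PPM = 60 / 90

print(f"raw image:   {serial_seconds(RAW_PAGE_BYTES):.1f} s per page")
print(f"vote record: {serial_seconds(VOTE_RECORD_BYTES) * 1000:.1f} ms per page")
print(f"budget at 90ppm: {SECONDS_PER_PAGE_AT_90PPM:.2f} s per page")
```

Under those assumptions an uncompressed page image needs tens of
seconds per sheet, while a small tally record fits with room to
spare, which is consistent with the ballot reading happening inside
the scanner.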

We aren't going to be able to hack those custom scanners for our
purposes. Instead we'll have "dumb" scanners and do much more intense
data processing at the PCs...which is why I started thinking "cluster".

Food for thought, comments welcome, etc.

= The content of this message, with the exception of any external
= quotations under fair use, are released to the Public Domain
Received on Tue May 31 23:17:24 2005
