bar code bit encoding.

From: Chris Schaefer <evm_at_1reality_dot_org>
Date: Tue Sep 16 2003 - 11:50:52 CDT

>Jan Karrman <jan@it.uu.se> wrote:
>|I don't understand why we should store data in bytes or nibbles.
>|Why not store each individual bit as an element in a vector?
>
>See my example recently. If we align on nibbles--or really on digits
>for Code128--then during debugging we can recognize quickly, for
>example, that digit #17-19 encode the vote in contest #6. It's a lot
>quicker and more direct to figure out if and why those digits aren't
>getting the right values.
>
>Programmers find it really easy to read decimal digits, but not so easy
>to read bits (especially when those bits have an indirect relationship
>to the encoded digits).
>
>That said, I don't really care that much. If anyone simply writes a
>'votes2digits()' function that does something consistent and
>interpretable... I'll happily accept the working code, no matter what
>encoding is used (see the "What we need right away" note).

Jan:

At this point our primary goal is to get a working demo running. For
this demo it's not super important which encoding we use, since we all
agree that a lot of this code will need to be completely rewritten
before we have a real system. So from the perspective of readability
during debugging in the demo, the encoding technique is not that
important. Whatever makes the developer's job easier (that's you, I
think, right?).

However, I believe there may be one reason to consider a bit
encoding technique different from the 116 bit vector, and it
has to do with reliability of scanning in the final demo. We will
be using inexpensive scanners. It makes sense to me (and I have
little experience with barcodes other than supermarket shopping) that
the shorter the code is, the more likely it is to read reliably.
If this is true, and we can get by with fewer than 116 bits, then
wouldn't our reliability go up even with cheap scanners? You have
far more experience than I do in this area. I'd be curious to hear
your opinions here!

The 116 bit vector represents uncompressed data. Since we know
precisely what information this data represents, we can develop a
more compact encoding for it. That is the path I started down.
For example, in the first sets of ballot items we know that only
ONE of the bits can be checked at a time; the voting software
prevents any other outcome. So instead of using the bits directly,
it makes more sense to simply encode the number of the bit that is
on (with zero representing no bit on). The only question then
becomes the size of that number. For most of the ballot items it
turns out that a 3 bit number will represent the data quite well. If
we went through the whole ballot this way, we'd end up with 72
bits total, roughly 60% of the 116 bits. The point about nibble
alignment is just that 3-bit numbers are hard to work with. By
increasing to 4 bits, we get an even nibble, and nibbles aren't too
bad to work with. Now we're at about 88 bits, but it's reasonably
easy to program. 'Nuf said.
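
To make the idea concrete, here is a rough Python sketch of the
"encode the bit number instead of the bits" approach. The function
names and the three sample contests are made up for illustration;
they are not the actual demo ballot or any existing code. It assumes
each choose-one contest arrives as a list of 0/1 values and that no
contest has more choices than fit in one field.

```python
def onehot_to_index(bits):
    """Return the 1-based position of the single set bit, or 0 if none is set."""
    if sum(bits) > 1:
        raise ValueError("more than one selection in a choose-one contest")
    return bits.index(1) + 1 if 1 in bits else 0


def index_to_onehot(index, width):
    """Inverse of onehot_to_index for a contest with `width` choices."""
    bits = [0] * width
    if index:
        bits[index - 1] = 1
    return bits


def pack(indices, field_bits=4):
    """Pack the indices into one integer, first contest in the highest field."""
    value = 0
    for idx in indices:
        if not 0 <= idx < (1 << field_bits):
            raise ValueError("index does not fit in the field")
        value = (value << field_bits) | idx
    return value


def unpack(value, count, field_bits=4):
    """Split a packed integer back into `count` indices."""
    mask = (1 << field_bits) - 1
    out = []
    for _ in range(count):
        out.append(value & mask)
        value >>= field_bits
    return out[::-1]


# Hypothetical contests: 5 choices, 4 choices (left blank), 7 choices.
contests = [[0, 0, 1, 0, 0], [0, 0, 0, 0], [1, 0, 0, 0, 0, 0, 0]]
indices = [onehot_to_index(c) for c in contests]
packed = pack(indices)          # nibble-aligned: one hex digit per contest
print(indices, format(packed, "03x"))
```

With 4-bit fields each contest shows up as exactly one hex digit when
the value is printed, which is the readability argument for nibbles.
Passing field_bits=3 gives the tighter packing, at the cost of fields
that no longer line up with printed digits.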

(BTW, for the rank vote at the end, I assume that you need to record
TWO numbers: one for the candidate and one for the rank. This is
because I assumed that in some voting systems you can rank more than
one candidate at the same level.)
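
A small sketch of what recording two numbers per ranked entry might
look like, again in Python. The names are hypothetical, and it
assumes both the candidate number and the rank fit in a nibble
(1 through 15), which may not match the real ballot.

```python
def encode_ranked(pairs):
    """Flatten (candidate, rank) pairs into a list of nibble values."""
    nibbles = []
    for candidate, rank in pairs:
        for v in (candidate, rank):
            if not 1 <= v <= 15:
                raise ValueError("value does not fit in a nibble")
        nibbles += [candidate, rank]
    return nibbles


def decode_ranked(nibbles):
    """Rebuild the (candidate, rank) pairs from the flat nibble list."""
    if len(nibbles) % 2:
        raise ValueError("expected an even number of nibbles")
    return list(zip(nibbles[0::2], nibbles[1::2]))


# Candidates 2 and 5 are both ranked first; candidate 3 is ranked second.
ranked = [(2, 1), (5, 1), (3, 2)]
encoded = encode_ranked(ranked)
print(encoded)
```

Storing the rank explicitly is what allows two candidates to share a
level. If ties were not allowed, the position in the list could stand
in for the rank and one number per entry would be enough.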

Again, I'm not really the right person to make the tradeoffs, since
I'm not doing the coding and I don't know much about scanners. In
my opinion, whatever works best and can get done most quickly is what
we should go with.

-C-

-- 
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Chris Schaefer                    Email: chris<AT>1reality(DOT)org
             Professional Bit Twiddler and student of reality.
                 "Global Information, Local Production"
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
==================================================================
= The content of this message, with the exception of any external 
= quotations under fair use, are released to the Public Domain    
==================================================================
Received on Tue Sep 30 23:17:06 2003
