Re: statistical study after next Tues election

From: Kathy Dopp <kathy_at_directell_dot_com>
Date: Sun Oct 31 2004 - 12:23:36 CST

>
> On Oct 29, 2004, at 7:40 PM, Kathy Dopp wrote:
>
>> to compare election results in counties using paperless DREs
>> with election results in counties using other voting machines to
>> see if there is any statistically increased evidence/likelihood of election
>> rigging/hacking/unexpected-errors with paperless DREs.
>>

Douglas W. Jones said:
>
> Such a study should include, on a county by county basis, the
> percent undervote on the presidential race and the percent overvote
> (where overvotes are allowed).

I'm not sure how to fairly compare over-votes in counties where they can
occur with counties where e-voting machines prevent over-votes.

> in precinct -- voted on whatever precinct vote collection technology
> postal absentee -- voted on paper ballots, usually machine scanned
> early voting -- paper in some places, DRE in some places

Thank you. That breakdown (precinct, postal absentee, and early voting)
is very important. The wonderful thing about this breakdown is that it
may solve a problem I've been having with my study design because I
haven't been able to find Presidential election polls or exit polling
numbers BY COUNTY yet, with which to compare the actual election results
to look for patterns. (There are many possible ways to look for patterns,
and it will be most convincing if we can validate any patterns we find
in more than one way.)

The postal absentee plus paper-based early voting could be the "control
group" with which to compare the election results by county to look for
patterns.
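
As a rough sketch of that comparison, assuming we had each county's
precinct tally and its combined paper tally (postal absentee plus
paper-based early voting), the gap for one candidate could be computed
like this. The function names and all numbers are made up for
illustration; nothing here is real election data.

```python
# Hypothetical per-county tallies; every name and number is illustrative.
def candidate_share(votes_for, total_votes):
    """Return a candidate's share of the votes cast, as a percentage."""
    if total_votes <= 0:
        raise ValueError("total_votes must be positive")
    return 100.0 * votes_for / total_votes


def precinct_vs_paper_gap(precinct, paper):
    """Percentage-point gap between the precinct share and the paper share.

    Each argument is a (votes_for_candidate, total_votes) pair. The paper
    pair is postal absentee plus paper-based early voting combined.
    """
    return candidate_share(*precinct) - candidate_share(*paper)


# Made-up county: 5,200 of 10,000 precinct votes and
# 1,000 of 2,000 paper votes for the same candidate.
gap = precinct_vs_paper_gap((5200, 10000), (1000, 2000))
print(f"gap: {gap:+.1f} percentage points")
```

A gap in a single county would not mean much by itself, since absentee
and early voters may differ from precinct voters; the interest is in how
the gaps compare across counties.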

Are these three breakdowns of "in precinct, postal absentee, and early
voting" available on

>
> It is important not to lump these together, if this can be helped. It is
> also important not to just aggregate all counties using the
> same technology, but to look at the spread, within each technology,
> depending on the county.

Yes. I 100% agree. Each county's variance from its own control number(s)
needs to be evaluated separately, and each county's variance compared to
the average variance of counties within its state.
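
One possible way to express "compared to the state average" is a
standard score per county. This is only a sketch under my own
assumptions: it uses an unweighted mean over counties, and the county
names and gap values are invented.

```python
from statistics import mean, pstdev


def deviations_from_state_mean(gaps_by_county):
    """Map each county to (gap - state mean) / state standard deviation.

    gaps_by_county is a dict of county name -> measure (for example a
    precinct-vs-paper gap in percentage points). The mean is unweighted,
    which is an assumption; weighting by turnout is another option.
    """
    values = list(gaps_by_county.values())
    centre = mean(values)
    spread = pstdev(values)
    if spread == 0:
        # Every county has the same value, so nothing stands out.
        return {county: 0.0 for county in gaps_by_county}
    return {
        county: (gap - centre) / spread
        for county, gap in gaps_by_county.items()
    }


# Invented values for one hypothetical state.
gaps = {"County A": 1.0, "County B": 2.0, "County C": 3.0, "County D": 10.0}
scores = deviations_from_state_mean(gaps)
for county, score in sorted(scores.items(), key=lambda item: -abs(item[1])):
    print(f"{county}: {score:+.2f}")
```

Counties with the largest absolute scores are the ones to look at first.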

>
> Here's an example from election 2000 in Florida:
>
> The residual vote (difference between turnout and sum of votes for
> one or another presidential candidate) was about 1.5 percent, on
> average, in counties using Optech optical mark-sense scanners, and
> 0.6 percent in counties using Global (now Diebold) precinct-count
> mark-sense scanners.

> The big question in my mind is, what did these 3 counties do really
> badly, and what did the best counties (more numerous with both
> brands of scanners) do right?

Absolutely Doug. There may be a correlation between high residual vote
rates and a county's variance between its precinct election results and
its absentee and early paper ballot election results, which would add
credibility to any patterns we find.
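
For reference, the residual vote as Doug defines it above (turnout minus
the sum of votes for presidential candidates) is simple arithmetic. A
minimal sketch, with a hypothetical function name and invented numbers:

```python
def residual_vote_rate(turnout, presidential_votes):
    """Percent of ballots cast that recorded no countable presidential vote.

    turnout is the number of ballots cast; presidential_votes is a list
    of vote totals, one per presidential candidate.
    """
    if turnout <= 0:
        raise ValueError("turnout must be positive")
    residual = turnout - sum(presidential_votes)
    if residual < 0:
        raise ValueError("more presidential votes than ballots cast")
    return 100.0 * residual / turnout


# Invented county: 20,000 ballots cast, 19,700 presidential votes counted.
rate = residual_vote_rate(20000, [9800, 9700, 200])
print(f"residual vote rate: {rate:.1f}%")
```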

Doug, would you please consider being a collaborator on this study?

I have chosen five swing states to study: Florida, New Jersey, Ohio,
Arkansas, and Iowa. Arkansas and Iowa are control states, having no
touchscreen machines. New Jersey and Ohio have just a few counties each
with paperless touchscreen machines, and Florida has a slew. Most of the
paperless touchscreens are ES&S. (These are the only swing states that
met the conditions for a study because other swing states either had
uniform statewide voting systems or had mixed voting systems within
counties, according to http://vevo.verifiedvoting.org/verifier/ )
However, five states may be too much work to get done in a short time, so
perhaps we could begin with Florida, and then work through other states
for comparison.

>
> This example illustrates the danger of just folding all the counties
> together into one figure for residual vote. The study must make it
> clear whether the residual figures were very consistent from county to
> county (probably evidence that the residual vote is a character of the
> voting system) or whether the residual vote varies widely with county

I have never considered lumping any counties together. Each county would
supply its own numbers to the study for each of the four studies I've
noticed could be done. My original plan was to compare variance in
counties using paperless touchscreens with variance in counties using
other voting methods, so I'm thinking completely along the same lines as
you. We were just considering two different measures of evaluation, and
the study will be much improved by reporting both measures and by further
seeing if the two measures

There are four possible measures of variance our study could compare
counties using:

1. residual rates
2. precinct vs. paper ballot absentee + paper ballot early
3. precinct vs. exit polls
4. precinct+absentee+early vs. election polls

Any one of these may give possible indications of problems with voting
equipment.

However, I haven't yet figured out how to get data BY COUNTY for 3. or 4.,
and I'm discovering it's a huge job to get and input data for any of
these, but I am committed to doing it.

If we could compute as many of the above four measures as possible, and then see
if the result points are correlated with each other, that would show a
lot. The idea is to look for outliers and see if there is a graphical
and numerical pattern.
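
To check whether two of the measures move together across counties, one
standard tool is the Pearson correlation coefficient. The sketch below
is my own illustration, not a settled part of the study design, and the
two lists hold invented per-county values.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length, at least 2 long")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a list is constant")
    return sxy / sqrt(sxx * syy)


# Invented values, one entry per county, in the same county order.
residual_rates = [0.6, 0.9, 1.5, 3.0]   # measure 1, in percent
paper_gaps = [0.5, 1.0, 1.4, 3.2]       # measure 2, in percentage points
r = pearson_r(residual_rates, paper_gaps)
print(f"r = {r:.3f}")
```

A value near +1 or -1 would suggest the two measures flag the same
counties; a value near 0 would suggest they do not.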

However, in particular, I want to compare patterns of variance in counties
with touchscreen voting machines and those without - not lumping them
together but seeing how strong the relationships are by breaking the
counties up in different ways and re-examining the data. (I think I'll
need to learn to use Matlab. It may be called regression analysis, but I
have to dig out my old stats books or consult with a math prof first.) I
need to gather the data together and do the preliminary arithmetic
(percents, subtractions, ...) and see what I see, before bothering any
math prof acquaintances for help doing the more complex statistical
analysis and probability calcs.
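
As a first pass at the regression idea, here is a small sketch that fits
a straight line to a per-county measure against a 0/1 flag for
"uses paperless touchscreens". With a 0/1 flag the slope is just the
difference between the two group averages. The function name, the flag
coding, and the data are all assumptions for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length, at least 2 long")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be the same")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Invented data: 1 = county uses paperless touchscreens, 0 = it does not.
uses_touchscreen = [1, 1, 1, 0, 0, 0]
gap_points = [3.0, 2.0, 4.0, 1.0, 0.5, 1.5]  # some per-county measure

intercept, slope = fit_line(uses_touchscreen, gap_points)
print(f"average without touchscreens: {intercept:.2f}")
print(f"extra with touchscreens:      {slope:+.2f}")
```

The slope alone does not say whether a difference is larger than chance
would produce; that needs a standard error or a significance test, which
is where the stats books or a math prof come in.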

> (probably evidence that the residual vote is a character of the
> administration -- and depending on how many counties exhibit
> what residual vote rates, evidence of how easy the technology is to
> mis-administer).
>
> Getting hold of the necessary info is fairly easy -- but it varies by
> state. Some states, like Florida, make this easy. Just go to the
> secretary of state's web site and download the statewide results.
> They're typically available broken down by precinct, but county-wide
> totals are also offered. Some figures aren't released immediately, but
> only come out later.

Doug, I would really appreciate any help you can give me to get hold of
the necessary info. I had planned to become a paying member of
http://www.uselectionatlas.org/ to obtain election results by county, but
Dave Leip's numbers may not be broken down by precinct, early, and
absentee, which are also needed.

>
> Visit county web sites. You can grab sample ballots there, and these
> samples are important!

Is that because residual errors may be due to ballot design? Darnit, I
need an assistant to do this study. It is a huge project just to gather
all the data and info.

>
> Do get the figures before the recount season begins! Don't let lawyers
> meddle with the numbers you use in this kind of statistical analysis.

Youch! What are you saying? The numbers posted on the county web sites
keep changing? Yikes.

Regards,

Kathy

>
> Doug Jones
> jones@cs.uiowa.edu

-- 
Kathy Dopp
Utah Count Votes
http://UtahCountVotes.org
phone: 435-608-1382
==================================================================
= The content of this message, with the exception of any external 
= quotations under fair use, are released to the Public Domain    
==================================================================
Received on Mon Nov 1 15:28:57 2004
