Re: Limitations of Many Eyes -- in the context of voting systems (fwd)

From: Alan Dechert <dechert_at_gmail_dot_com>
Date: Mon May 19 2008 - 20:24:34 CDT

Brian, I agree with what you say here 100% ... well, almost 100%. The only
thing I'd quibble with is how important it is. It's important, but on a
scale of 1 to 5 where 5 is most important, I'd give it a 1.

The only conceivable [mis-] use of David's point against open voting
advocates would be by a hand count advocate claiming that computers should
not be used at all.

David's point may be used to refute an open voting advocate's claim that
making the software open source solves everything. However, no one is
making that claim.

We don't think we are going to do away with computers, and we aren't
claiming open source solves everything. David's point might be a good
argument against paperless DREs. But OVC has always been against paperless
voting, favoring systems with a tangible paper ballot instead. So, David's
point is moot -- for the most part, as far as we are concerned.

If we agree that computers will continue to be used in election
administration (especially in ballot presentation and vote tabulation), then
David's point supports simplicity and transparency in voting systems. The
fact that "detecting malicious backdoors is really hard" certainly does not
argue in favor of systems that are more complex and opaque.

Ping's dissertation contains some excellent explanations of ballot
prerendering (pg 59-66 and 157). Ping mentions that he got the idea from
Steve Bellovin; OVC pioneered the idea in 2004, and Ping notes OVC's use of
prerendering on pg 175. Prerendering enables us to get the job done with
much less code and with screens that can be checked by anyone -- much
better for open voting advocates.
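
To make this concrete, here is a minimal sketch of the idea (this is not
Pvote's actual code; the ballot layout, file name, and function below are
made up for illustration). With prerendering, all layout and text rendering
happens before election day, and the trusted code on the machine only
displays fixed images and maps touch coordinates to predefined target
regions:

    # Illustrative sketch of a prerendered voter interface (hypothetical
    # data; not Pvote). The screen image is produced ahead of time and
    # can be published for anyone to inspect.
    BALLOT = {
        "contest-1": {
            "image": "contest-1.png",   # prerendered screen image
            "targets": [                # (x, y, width, height, option id)
                (100, 200, 400, 60, "candidate-A"),
                (100, 300, 400, 60, "candidate-B"),
            ],
        },
    }

    def hit_target(screen_id, x, y):
        """Return the option id under a touch at (x, y), or None."""
        for tx, ty, tw, th, option in BALLOT[screen_id]["targets"]:
            if tx <= x < tx + tw and ty <= y < ty + th:
                return option
        return None

    # The trusted loop never lays out the ballot or renders text; it only
    # shows the prerendered image and records which target was touched.
    print(hit_target("contest-1", 150, 320))   # -> candidate-B
    print(hit_target("contest-1", 10, 10))     # -> None

Because the screen images can be published and checked before the election,
and because the code that remains in the trusted base is this simple, the
whole package is far easier for the public to inspect.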

It might be good to set the context a little better. The experiment David
describes below is written up briefly in Ping's dissertation on pages
296-299:
http://www.eecs.berkeley.edu/Pubs/TechRpts/2007/EECS-2007-167.pdf

and in more detail in the accompanying technical report:
http://www.eecs.berkeley.edu/Pubs/TechRpts/2007/EECS-2007-136.pdf

Ping says in his dissertation, "Detecting malicious code in a code review is
extremely difficult" (pg 155). But he is not arguing against using
computers in his dissertation. On the contrary, he makes a pretty good
case for using computers (pg 9-10). And he does not argue against open
source. In fact, his Pvote code is open and under the GPL.

Furthermore, David is a co-PI on the ACCURATE project. The ACCURATE
proposal (http://accurate-voting.org/accurate/docs/proposal-feb2005.pdf),
funded by the National Science Foundation, says:

     We will make a special effort to involve the Open Voting
     Consortium (OVC), a nonprofit group devoted to exploring
     the application of the open-source development process to
     the domain of voting and elections. We hope to be able to
     provide technical guidance to OVC volunteer developers,
     we plan to use the products developed by the OVC as test
     cases, and we may be able to use the efforts of OVC
     volunteer developers to implement and test results of our work.

Now, how could this possibly work if their code were secret and proprietary?
In case anyone missed it, our current demo disk uses the Pvote code for the
voter interface, which is documented in Ping's dissertation and in the
assurance document David referenced
(http://www.eecs.berkeley.edu/Pubs/TechRpts/2007/EECS-2007-40.pdf). I often
refer to Ping's dissertation in my demo presentations.

Your point about open source being more likely to receive better scrutiny
than published source is a good one. Another relevant difference is the use
of specialized election-dedicated hardware compared to commodity hardware.
If it's open source but running on hardware that is generally not available,
then it won't receive as much scrutiny as software running on hardware that
everyone has access to.

I think Ping's work (and David's work) is extremely valuable. However,
we are working on creating a good system with manageable security that
enables public oversight -- not a perfect one that is inherently secure and
doesn't need public oversight. In his dissertation, Ping lists some
assumptions (pg 139) regarding Pvote security. The assurance document lists
"Threats out of scope" (see pg 6). I understand why these are simply
listed rather than addressed, but in the real world these are things that
need attention too. Voting system security exists in layers -- much of it
has to do with the procedures surrounding the use of the hardware and
software.

So, let's note David's point, but we need to forge ahead and concern
ourselves with creating the best voting system we can make.

Alan D.
>
> I'm forwarding the email below with David's permission. The email was
> sent to a list of folks interested in the intersection of security and
> open source. It's self-explanatory, and many of you have probably already
> heard about the study. I think it's important, as advocates of open
> source voting systems, that we understand this study carefully; David's
> message sets it in an appropriate context.
>
> In my own opinion, instead of spending a lot of time and energy refuting
> the study (which could still have its flaws, and will be overgeneralized
> and spun by foes of open source voting), advocates of the security
> advantages of open source voting should acknowledge it as a reality
> of all software, and argue that open source still gives us the best shot
> at detecting exploits, rather than providing any sort of assurance that
> exploits could not exist because they would be quickly found. It should
> also reinforce the importance of differentiating between "published
> source" and true open source, as "published source" is still unlikely to
> get anywhere near the amount of scrutiny per line of code. It could also
> point out the importance of funding review efforts; open sourcing the tech
> doesn't cause code reviewers to appear; it only makes them possible.
>
> Most importantly, it should point out why publishing source code in any
> manner does not obviate the need for an auditable (and actually audited)
> voting process.
>
> Brian
>
> ---------- Forwarded message ----------
> Date: Sun, 18 May 2008 12:47:57 -0700 (PDT)
> From: David Wagner <daw@cs.berkeley.edu>
> Subject: Re: [ROS] Limitations of Many Eyes -- in the context of voting
> systems
>
>> The original reason behind the research that led to the static analysis
>> tools MALPAS, SPADE and SPARK was to detect trojan code in UK Government
>> security software.
>>
>> If we could implement open source software using programming languages
>> that support sound static analysis of data flow and path expressions, it
>> would be far easier to check for trojans.
>
> I used to think so, too. But Ping Yee and I recently did an
> experiment whose results have raised serious doubt about this in
> my mind.
>
> In Ping's PhD dissertation, Ping came up with an alternative architecture
> for a voting machine that vastly reduces its trusted computing base (down
> to < 500 lines of simple Python code). We hypothesized that this would
> make it much easier for reviewers to find unintentional and intentional
> flaws in the software. We decided to test this hypothesis by inviting
> several security experts to review our code.
>
> As part of the experiment, we injected three artificial bugs into our
> code, to see if the reviewers would find them. These bugs were an
> attempt to simulate a malicious backdoor that could influence election
> results. We chose these bugs so they would be unlikely to be detected
> by ordinary pre-election testing. We crafted the bugs to offer a range
> of difficulty in finding them: one that we expected to be easy to find,
> one that we expected to be of medium difficulty to spot, and one that
> we expected to be difficult for a code reviewer to notice.
>
> We then ran a code review, with a total of 4 expert reviewers. They were
> broken down into 3 teams of 1-2 reviewers, each team working separately
> for one day to review the buggy code. At the start of the day we
> told them "there exists at least one bug in this code!" and told them
> that we had inserted an artificial bug (we didn't tell them that we
> had inserted more than one until they found their first bug). We even
> revealed that the bug was in a narrow subrange of 100 lines of code, to
> give them an advantage and narrow the focus of their review. In fact,
> all 3 artificially inserted bugs were in that range. We then sat back to
> see whether they found the bugs.
>
> One team found both the "easy" bug and the "medium" bug, without much
> difficulty, but never found the "difficult" bug. Another team found
> the "easy" bug after about four hours, but never found the "medium" or
> "difficult" bugs. A third team found the "easy" bug after about two
> hours but never found the "medium" or "difficult" bugs. After a total
> of about 33 reviewer-hours, no one ever found the "difficult" bug.
>
> Of course, had we actually been malicious, we would have inserted just
> the "difficult-to-find" bug, and we would have gotten away with it.
> (The only reason we also inserted the "easy" and "medium" bugs was that
> we were worried they wouldn't find the
> "difficult" bug!) I should also point out that we came up with these
> malicious backdoors after only 2-3 hours of brainstorming -- we only
> came up with the idea of inserting artificial bugs the night before,
> so we were in a hurry.
>
> In short, in 2-3 hours we were able to come up with a malicious backdoor
> that 4 reviewers spending a total of 33 reviewer-hours could not
> find, even with the information that there was a bug and with the side
> information that it could be found within a particular subrange of 100
> lines of code. This suggests that malicious backdoors can be very, very
> difficult to detect, even in situations carefully crafted to be the best
> possible for the defenders.
>
> Let's compare to current commercial practice. Typical code reviewing
> practice today for commercial code involves 1-2 reviewers reviewing code
> at a rate of 100-200 lines/hour. In comparison, in our experiment the
> reviewers reviewed code at a rate of about 14 lines/hour -- they had a
> lot more time, and we had more eyeballs, and the malicious backdoor was
> still not detected.
>
> Or, we can put this in context by looking at the cost. I estimate that
> it cost something like $10,000 to hire 4 reviewers with these
> qualifications to perform this kind of code review. But in this experiment
> we reviewed only 100 lines of code; real-world systems are much larger
> than that. Imagine, for example, an industrial system with 10,000 lines of
> code;
> that's pretty tiny by commercial standards, yet the experiment we ran
> suggests that one could spend $1 million and still not detect a
> malicious backdoor. (And this isn't even taking into account the fact
> that in larger systems there are probably more opportunities for even
> more subtle kinds of backdoors.)
>
> For me, the results of this experiment were eye-opening, because
> they violated my intuition and my expectations so dramatically.
> The conventional wisdom says that there is no reliable way to check
> for malicious backdoors in a large, complex software system. I had
> been hoping that the situation might be different if one followed sound
> engineering principles, kept the system simple, and minimized the size
> of the codebase. Alas, it didn't work out that way. This experiment
> suggests to me that there is no feasible, reliable way to check for
> malicious backdoors in software of any size -- not even for small,
> simple systems that have been thoughtfully designed, architected, and
> implemented with the malicious code problem in mind.
>
>
> More details about this experiment can be found in Chapter 8 of
> Ping's thesis (http://zesty.ca/pubs/yee-phd.pdf). Even more details
> can be found in his techreport:
> http://www.eecs.berkeley.edu/Pubs/TechRpts/2007/EECS-2007-136.html
>
>
> What about programming languages designed to support sound static
> analysis and formal program verification? Would they provide a reliable
> way to detect deliberately inserted backdoors and malicious logic? I
> don't know, but I'm not convinced yet.
>
> Our experiment wasn't designed to test that question. But, Ping
> put a lot of effort into manual verification of the code. He put
> together a 70-page assurance document, which was made available to the
> reviewers. This document contained his best attempt at a requirements
> analysis, a careful explanation of the threat model and the correctness
> claims he was making about his software, as well as rigorous analysis of
> each of those claims. He also translated those claims into invariants,
> preconditions, and postconditions on the code and performed a line-by-line
> analysis of the code, showing the pre- and post-conditions needed at
> each line to prove these claims. In other words, this analysis was
> along the lines of what an automated program verification tool might
> perform, except that Ping did it by hand. The assurance document is
> available here:
> http://www.eecs.berkeley.edu/Pubs/TechRpts/2007/EECS-2007-40.html
> Ping did use a programming language subset that was explicitly designed
> to support human code review and chosen with simplicity in mind (details
> in Chapter 3 of the assurance document), though sadly we didn't have
> automated static analysis tools.
>
> I don't believe that "sound static analysis of data flow and path
> expressions" would have made it any easier to find any of the three
> malicious backdoors we artificially inserted.
>
> But let me be more charitable, and imagine the best possible program
> verification and analysis tool I can plausibly envision (even though
> I think it's beyond the state of the art to build such a thing today,
> this is the most I can think to ask for). I think such a hypothetical
> tool would have fared as follows:
> * A good program verification tool could have detected the "easy"
> bug. It falls squarely into the category of the kind of bugs that
> formal verification is intended to find.
> * A strict static analysis tool might have helped to detect the
> "medium" bug, though this is debateable -- I could argue it
> either way.
> * Hard to say, but I think formal verification would not have
> made a huge difference in the difficulty of detecting the "difficult"
> bug (I suspect developing a spec that would be sufficient to detect
> the bug is as hard as finding the bug -- the hard part is arguably
> recognizing the need for this particular special-case requirement,
> and once you realize that, I suspect you'll have a decent chance of
> finding the bug either way).
>
> One challenge here is that such tools require specifications. If
> we are assuming the possibility of malicious logic in the program, we
> also have to assume the possibility of malicious logic in the
> specification. Since formal verification tools can, at best, only
> verify the consistency of the program and the specification, a matching
> bug in both the program and the spec can evade detection. And checking
> detailed, program-level specs for malicious logic does not seem likely
> to be appreciably easier than checking programs for malicious logic.
>
> But that's not even the worst of it. The worst of it is that
> we have to assume that a malicious insider will know that formal
> verification is being used and will craft the bug to minimize the
> chances that the formal verification tool will detect the bug. (After
> all, the malicious insider can run the formal verification tools himself
> and check whether his bug will be detected, and he can keep trying until
> he finds a bug that is unlikely to be detected by that tool.) In
> other words, we have to assume that the attacker will adapt to the
> defenses we put in place. If there is any chink in the armor, we
> have to assume the attacker will find it. That makes this problem
> challenging, even if we assume major advances in program verification.
>
>
> So for me, I've gradually come to the hypothesis that detecting
> malicious backdoors is really hard, even in the best possible case.
> I don't know how to do it with any confidence or assurance, for
> any non-trivial system, no matter how well-designed or implemented.
> Sounds like a difficult open problem to me.
>
> -- David

_______________________________________________
OVC-discuss mailing list
OVC-discuss@listman.sonic.net
http://lists.sonic.net/mailman/listinfo/ovc-discuss
By sending email to the OVC-discuss list, you thereby agree to release the content of your posts to the Public Domain--with the exception of copyrighted material quoted according to fair use, including publicly archiving at http://gnosis.python-hosting.com/voting-project/