Re: Open Source Quality

From: Edward Cherlin <edward_dot_cherlin_at_etssg_dot_com>
Date: Mon May 10 2004 - 17:19:19 CDT

On Sunday 09 May 2004 11:58, Douglas W. Jones wrote:
> On May 9, 2004, at 12:49 PM, David Mertz wrote:
> >> The presence of these errors argues (minimally) for
> >> garbage-collected languages and fulltime array bounds
> >> checking to help a sloppy programmer detect these problems.
> >
> > I.e. Python is the *right* language to write OVC's reference
> > in.
>
> Well, the same could be said for Java, LISP, Simula 67 and a
> number of other languages. This merely demonstrates that
> Python is a member of a class of relatively safer languages.

APL, also.

Another issue is checking program logic to see whether all
possible input cases are handled. Defensive programming
technique calls for a "This Can't Happen" case in every case
construct, and a number of similar measures. A lot of programs
need to handle malformed input routinely, but some people seem
to think that it's your own fault if you apply text utilities to
binary files.

Another issue is choosing a language that is simple enough so
that the implementation can be verified. FORTH is the leader by
this measure, though of course it fails on bounds checking and
garbage collection. FORTH compilers in FORTH are typically half
a page long, and depend on less than 8K of machine language. The
FORTH inner interpreter is, on some computer architectures, one
instruction (indirect jump with post-increment).

Other than that, LISP and APL have the simplest grammars, and
there are single-person implementations of both with quite
readable code. The parser definition for Iverson's latest
incarnation of APL, called J, is 12 lines in all.

> > Overall, it's quite striking how much more reliable GNU
> > utilities are than commercial equivalents.
>
> This is indeed striking evidence that open source leads to
> higher quality standards than is normal in the industry.
>
> > However, the paper made a peculiar error in its description.
> > It described the bugs detected as including "infinite
> > loops." Well, there's this little thing called the Halting
> > Problem that says you don't know in the general case if a
> > program will ever end.
>
> I'll lay odds that most of the hangs detected by the Fuzz test
> suite are indeed infinite loops.
>
> I have, however, made this mistake once in my career, when I
> assigned students the job of writing a sort program in LISP
> (limited to the functional programming subset) and tested them
> with a list of 10 random digits. Most solutions produced
> results faster than you could blink your eyes, but one
> solution had "obviously" hung. We gave that student a zero,
> and the student complained. "You didn't give it long enough"
> he said. "How long is enough?" I asked. "several hours," he
> replied. It turned out that, while most students had sort
> algorithms that took O(n log n) time and a few had O(n**2)
> algorithms, his was at least O(n**4). We never did figure out
> how he'd done this, but part of the trick was to do array
> indexing suing an O(n**2) algorithm to pick the n'th item out
> of a list. (This story happened 20 years ago, but the
> relative improvement of computer speed since then is pretty
> much irrelevant.)

There is a somewhat similar case in the Cryptonomicon, where the
student fails an engineering class for turning in a
mathematically correct program that would take millions of years
to run. In his case, however, that was the best result known
rather than one of the worst.

> Doug Jones
> jones@cs.uiowa.edu

-- 
Edward Cherlin, Simputer Evangelist
Encore Technologies (S) Pte. Ltd.
New voices in the global conversation
http://www.ryze.com/go/Cherlin
==================================================================
= The content of this message, with the exception of any external 
= quotations under fair use, are released to the Public Domain    
==================================================================
Received on Mon May 31 23:17:31 2004

This archive was generated by hypermail 2.1.8 : Mon May 31 2004 - 23:18:16 CDT