Re: Mining the data in the correspondence archives

From: Keith Copenhagen <K_at_copetech_dot_com>
Date: Tue Nov 30 2004 - 22:17:05 CST

Perhaps we could generate a (limited) list of key terms.

For example:
"Open Source"
Security
"Hardware Platform"
"Operating System"
"Readable Ballot"
Canvassing
"Voter Rolls"
Audit

The resulting list of emails could turn into wiki pages (maybe with a grep
+/-1 line kind of intro), and over time we could cull them back by hand to
the ones that capture the considerations and the consensus.
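
As a rough sketch of the "grep +/-1 line" idea, something like the following
Python could pull each key-term hit out of a message along with one line of
context on either side. It assumes each archived message is already available
as a plain-text string; the names KEY_TERMS and find_term_context are made up
for illustration and are not part of any existing tool.

```python
import re

# Key terms from the list above; multi-word terms are matched as phrases.
KEY_TERMS = [
    "Open Source", "Security", "Hardware Platform", "Operating System",
    "Readable Ballot", "Canvassing", "Voter Rolls", "Audit",
]


def find_term_context(text, terms=KEY_TERMS, context=1):
    """Return {term: [snippet, ...]} for every term found in `text`.

    Each snippet is the matching line plus `context` lines before and
    after it, similar in spirit to `grep -i -C 1`.
    """
    lines = text.splitlines()
    hits = {}
    for term in terms:
        # Case-insensitive, whole-word (or whole-phrase) match.
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        for i, line in enumerate(lines):
            if pattern.search(line):
                start = max(0, i - context)
                end = min(len(lines), i + context + 1)
                hits.setdefault(term, []).append("\n".join(lines[start:end]))
    return hits
```

Running this over every message and grouping the snippets by term would give
a first draft of one wiki page per term, which we could then prune by hand.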

-Keith

On Tue, 30 Nov 2004 13:42:03 -0800 (PST), Edmund R. Kennedy
<ekennedyx@yahoo.com> wrote:

> Hello:
>
> Um. David, I mean preparing a separate index of the documents that
> people could consult to see what's there. I didn't mean 'indexing'
> although that's a reasonable interpretation. That sort of key index is
> effectively invisible to the final user and isn't very useful when
> you don't know what key word to use. When I turn to a paper index and
> don't know the search term I can skim quickly through the index and
> usually find what I'm looking for quickly.
>
> I'm kind of thinking something like an index of threads, an index of
> authors, resorting those by date, or size, etc. Right now, the
> correspondence archives are effectively a knowledge swamp. The ultimate
> goal is to try to drain the swamp and systematically extract the
> information to end up in the Wiki or even the FAQ. No, I'm not hip deep
> in alligators yet.
>
> David Mertz <voting-project@gnosis.cx> wrote:
> On Nov 30, 2004, at 3:35 PM, laird popkin wrote:
>> The indexing part is well solved by a number of open source text
>> indexing engines, such as Apache's Lucene, or Zilverline.
>> KnowledgeTree also looks very interesting -- it's a full-fledged
>> document management system that includes text indexing. So that might
>> be overkill.
>
> Well, yeah. But it's even easier to solve using Google.
>
> That's what the archive site already does; Google happily spiders our
> email archive with good regularity. The search box there is just a
> Google search with a "site:..." restriction (and I think a little
> kludge where I add the term 'hypermail' which is the archive generating
> program that puts a little blurb on archived pages; just to exclude
> other documents I may host at the same domain).
>
> _______________________________________________
> OVC discuss mailing lists
> Send requests to subscribe or unsubscribe to
> arthur@openvotingconsortium.org
>
>

-- 
Keith Copenhagen
==================================================================
= The content of this message, with the exception of any external 
= quotations under fair use, are released to the Public Domain    
==================================================================
Received on Tue Nov 30 23:17:43 2004

This archive was generated by hypermail 2.1.8 : Tue Nov 30 2004 - 23:17:44 CST