Re: Mining the data in the correspondence archives

From: laird popkin <lairdp_at_gmail_dot_com>
Date: Wed Dec 01 2004 - 11:07:36 CST

You can see a very interesting view of our discussions by going to
http://www.kartoo.com/ and searching for
"site:gnosis.python-hosting.com hypermail". Then highlight the site icon
and click on 'more pages of this site' in the menu on the bottom
right. It generates a graphical, structured view of the email archive,
and you can then drill down into particular topics. For example, if you
click on "security" (i.e. search for "site:gnosis.python-hosting.com
security") you get a nice breakdown of our security discussions.
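
For anyone who wants to generate these queries for a list of topics
rather than typing them, here is a minimal sketch in Python. It only
builds the search string in the "site:... topic" form used above; the
function names are made up for illustration, and how a given search
engine expects the string to be submitted is not covered here.

```python
from urllib.parse import quote_plus

# Host name taken from the searches described above.
ARCHIVE_HOST = "gnosis.python-hosting.com"


def archive_query(topic, host=ARCHIVE_HOST):
    """Build a site-restricted search string for one topic.

    Multi-word topics are wrapped in double quotes so they are
    searched as a phrase. (Hypothetical helper, not part of any
    existing tool.)
    """
    topic = topic.strip()
    if " " in topic and not topic.startswith('"'):
        topic = f'"{topic}"'
    return f"site:{host} {topic}"


def url_encoded(query):
    """Percent-encode a query so it can be placed in a URL parameter."""
    return quote_plus(query)


if __name__ == "__main__":
    for topic in ["hypermail", "security", "Open Source"]:
        print(archive_query(topic))
```

The same string can be pasted into any search box that honors the
"site:" restriction.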

- LP

On Tue, 30 Nov 2004 20:17:05 -0800, Keith Copenhagen <k@copetech.com> wrote:
> Perhaps we could generate a (limited) list of key terms:
>
> For example:
> "Open Source"
> Security
> "Hardware Platform"
> "Operating System"
> "Readable Ballot"
> Canvassing
> "Voter Rolls"
> Audit
>
> The resulting list of emails could turn into wiki pages (maybe with a grep
> +/-1 line kind of intro) and over time we could cull them back by hand to
> the ones that capture the considerations and the consensus.
>
> -Keith
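
The "grep +/-1 line" idea above could be sketched roughly as follows.
This is only an illustration: it assumes the messages have already been
reduced to plain text and loaded into a dict of name -> text (the
archive itself is HTML, and that conversion step is not shown), and the
function names are invented for the example.

```python
# Key terms taken from the list suggested above.
KEY_TERMS = [
    "Open Source", "Security", "Hardware Platform", "Operating System",
    "Readable Ballot", "Canvassing", "Voter Rolls", "Audit",
]


def grep_context(lines, term, context=1):
    """Return (line_number, snippet) for every line containing `term`.

    Matching is a case-insensitive substring test; the snippet is the
    matching line plus `context` lines on either side.
    """
    needle = term.lower()
    hits = []
    for i, line in enumerate(lines):
        if needle in line.lower():
            lo = max(0, i - context)
            hi = min(len(lines), i + context + 1)
            hits.append((i + 1, lines[lo:hi]))
    return hits


def term_index(messages, terms=KEY_TERMS):
    """Map each term to {message_name: hits}.

    `messages` is assumed to be a dict of message name -> plain text.
    Terms with no matches map to an empty dict.
    """
    index = {}
    for term in terms:
        per_message = {}
        for name, text in messages.items():
            hits = grep_context(text.splitlines(), term)
            if hits:
                per_message[name] = hits
        index[term] = per_message
    return index
```

Each term's entry could then seed one wiki page, with the snippets
serving as the short intro for every linked email, ready to be culled
by hand.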
>
> On Tue, 30 Nov 2004 13:42:03 -0800 (PST), Edmund R. Kennedy
> <ekennedyx@yahoo.com> wrote:
>
> > Hello:
> >
> > Um. David, I mean preparing a separate index of the documents that
> > people could consult to see what's there. I didn't mean 'indexing'
> > although that's a reasonable interpretation. That sort of key index is
> > effectively invisible to the final user and isn't very useful when
> > you don't know what key word to use. When I turn to a paper index and
> > don't know the search term I can skim quickly through the index and
> > usually find what I'm looking for quickly.
> >
> > I'm kind of thinking something like an index of threads, an index of
> > authors, resorting those by date, or size, etc. Right now, the
> > correspondence archives are effectively a knowledge swamp. The ultimate
> > goal is to try to drain the swamp and systematically extract the
> > information to end up in the Wiki or even the FAQ. No, I'm not hip deep
> > in alligators yet.
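
A browsable index of threads and authors like the one described above
might look something like this minimal sketch. It assumes each message
has already been parsed into a small record with a subject, author,
date and size; the Message type and the function names are hypothetical
and not part of hypermail or any existing tool.

```python
from collections import defaultdict
from datetime import datetime
from typing import NamedTuple


class Message(NamedTuple):
    """Assumed per-message record; parsing the archive is not shown."""
    subject: str
    author: str
    date: datetime
    size: int  # e.g. bytes or lines, whichever is available


def thread_key(subject):
    """Strip leading 'Re:' prefixes so replies group with the original."""
    s = subject.strip()
    while s.lower().startswith("re:"):
        s = s[3:].strip()
    return s


def build_index(messages, key):
    """Group messages by key(message); each group is sorted by date."""
    groups = defaultdict(list)
    for m in messages:
        groups[key(m)].append(m)
    for msgs in groups.values():
        msgs.sort(key=lambda m: m.date)
    return dict(groups)


def by_threads(messages):
    return build_index(messages, lambda m: thread_key(m.subject))


def by_authors(messages):
    return build_index(messages, lambda m: m.author)


def largest_first(index):
    """Return group names ordered by total message size, largest first."""
    return sorted(index, key=lambda k: sum(m.size for m in index[k]),
                  reverse=True)
```

Grouping by subject is only a rough stand-in for real threading; it is
enough to produce a list that someone can skim without knowing the
search term in advance.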
> >
> > David Mertz <voting-project@gnosis.cx> wrote:
> > On Nov 30, 2004, at 3:35 PM, laird popkin wrote:
> >> The indexing part is well solved by a number of open source text
> >> indexing engines, such as Apache's Lucene, or Zilverline.
> >> KnowledgeTree also looks very interesting -- it's a full-fledged
> >> document management system that includes text indexing. So that might
> >> be overkill.
> >
> > Well, yeah. But it's even easier to solve using Google.
> >
> > That's what the archive site already does; Google happily spiders our
> > email archive with good regularity. The search box there is just a
> > Google search with a "site:..." restriction (plus, I think, a little
> > kludge where I add the term 'hypermail', the name of the archive-generating
> > program, which puts a little blurb on archived pages; that is just to
> > exclude other documents I may host at the same domain).
> >
> > _______________________________________________
> > OVC discuss mailing lists
> > Send requests to subscribe or unsubscribe to
> > arthur@openvotingconsortium.org
> >
> >
>
> --
> Keith Copenhagen
>

-- 
- Laird Popkin, cell: 917/453-0700
==================================================================
= The content of this message, with the exception of any external 
= quotations under fair use, are released to the Public Domain    
==================================================================
Received on Fri Dec 31 23:17:01 2004

This archive was generated by hypermail 2.1.8 : Fri Dec 31 2004 - 23:17:22 CST