httpd-dev mailing list archives

From Brian Behlendorf <br...@organic.com>
Subject Re: sane mail-archive
Date Fri, 09 Jan 1998 23:31:08 GMT
At 06:06 PM 1/9/98 -0500, Rodent of Unusual Size wrote:
>Brian Behlendorf wrote:
>> 
>> I'll post the spec soon... but the basic idea is that for each mbox file
>> you have a dbm file (or sql database table, whatever) storing message
>> beginnings, the basic headers, x-refs, etc.  I.e., a threads database like
>> news servers have.  The search engine returns hits to particular messages,
>> showing the metainfo.  Every message has a unique URL.  Etc.
>
>I have a half-implemented version of exactly this.  The major stumbling
>block has been the index; I want full-text, and my first pass (before
>I had to turn my attention elsewhere) involved breaking the mbox into
>separate pieces and running a wais index of it, and rewriting the
>index pointers to be MIDs rather than file names.
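
As an illustration of the per-mbox index described in the quoted text, here is a minimal Python sketch. It is not the spec mentioned above, which had not been posted. The names `index_mbox` and `lookup`, the JSON record layout, and the choice of Message-ID as the key are all assumptions made for the example. It also splits messages naively on lines beginning with "From ", which a real mbox reader would need to handle more carefully.

```python
import dbm
import json
from email.parser import BytesHeaderParser


def index_mbox(mbox_path, db_path):
    """Store each message's byte offset and basic headers, keyed by Message-ID.

    Assumption: a line starting with b"From " always begins a new message.
    Returns the number of messages written to the index.
    """
    parser = BytesHeaderParser()
    count = 0
    with open(mbox_path, "rb") as mbox, dbm.open(db_path, "c") as db:
        # First pass: note the byte offset of every message beginning.
        starts = []
        offset = 0
        for line in mbox:
            if line.startswith(b"From "):
                starts.append(offset)
            offset += len(line)
        ends = starts[1:] + [offset]

        # Second pass: read each message and record its headers.
        for start, end in zip(starts, ends):
            mbox.seek(start)
            raw = mbox.read(end - start)
            # Drop the "From " separator line before parsing the headers.
            _, _, rest = raw.partition(b"\n")
            headers = parser.parsebytes(rest)
            mid = headers.get("Message-ID")
            if mid is None:
                continue  # no key to file this message under
            record = {
                "offset": start,
                "length": end - start,
                "subject": headers.get("Subject"),
                "from": headers.get("From"),
                "date": headers.get("Date"),
                "references": headers.get("References"),  # the x-refs
            }
            db[str(mid).strip().encode()] = json.dumps(record).encode()
            count += 1
    return count


def lookup(db_path, message_id):
    """Return the stored record for a Message-ID, or None if it is absent."""
    with dbm.open(db_path, "r") as db:
        key = message_id.encode()
        return json.loads(db[key]) if key in db else None
```

With a record in hand, a server could seek to `offset` in the mbox, read `length` bytes, and serve that message at its own URL.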

For the search tool I'd suggest just grabbing glimpse or swish or something
and hacking it to not be file-system based but "feed me text, feed me an
identification string"-based.
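
To make the "feed me text, feed me an identification string" interface concrete, here is a rough Python sketch of what such an indexer could look like from the caller's side. This is not how glimpse or swish work internally; it is a toy in-memory inverted index, and the class name `FeedIndex` and its methods are hypothetical.

```python
import re
from collections import defaultdict


class FeedIndex:
    """Toy full-text index keyed by arbitrary ID strings rather than file names."""

    def __init__(self):
        # word -> set of identification strings whose text contains it
        self._postings = defaultdict(set)

    @staticmethod
    def _words(text):
        # Lowercase and split on anything that is not a letter or digit.
        return re.findall(r"[a-z0-9]+", text.lower())

    def feed(self, ident, text):
        """Index `text` under the identification string `ident` (e.g. a MID)."""
        for word in set(self._words(text)):
            self._postings[word].add(ident)

    def search(self, query):
        """Return the IDs of all texts containing every word in `query`."""
        words = self._words(query)
        if not words:
            return set()
        hits = [self._postings.get(w, set()) for w in words]
        return set.intersection(*hits)
```

The identification string would typically be the Message-ID, so a hit maps straight back to a message rather than to a file on disk:

```python
index = FeedIndex()
index.feed("<one@example.com>", "running a wais index of the mbox")
index.feed("<two@example.com>", "grab glimpse or swish and hack it")
print(index.search("wais index"))  # {'<one@example.com>'}
```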

	Brian


--=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=--
specialization is for insects				  brian@organic.com
