lucene-java-user mailing list archives

From Paul Querna <c...@force-elite.com>
Subject Performance of MultiSearcher?
Date Sun, 27 Mar 2005 00:07:18 GMT
Hello,

I am working on using Lucene-based indexes for the ASF's mod_mbox.
Current versions of mod_mbox support MIME, and I am trying to add
full-text searching. (Then we can completely remove Eyebrowse.)

Currently I am hacking around with the C++ (CLucene) implementation,
but I intend to migrate to Lucene4c shortly.

I was structuring one Lucene index per mailing list.  To search all
mailing lists, I was planning on using a MultiSearcher.

Currently, the ASF public mail archives use about 17 GB, uncompressed,
in the raw mbox format.

There are also about 300 mailing lists in the public archives.

Can a MultiSearcher quickly search 300 different indexes?  I am
thinking that it cannot.  300 separate indexes is a lot of files to
scan, even if Lucene is fast.  Any experience from other users would be
helpful.

Would it be better to have a single main index for all of the lists,
and include the list name as a keyed field?

I suspect most searches would be restricted to one or two lists, but I 
would like good performance if I wanted to search all of the ASF lists.
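
To make the two layouts concrete, here is a toy sketch in Python. It is
not Lucene, CLucene, or Lucene4c code: ToyIndex, search_per_list, and
the sample messages are all made up for illustration. It only shows
that one index with the list name stored as a single untokenized term
can answer both the "one or two lists" query and the "all lists" query,
while the per-list layout has to visit every index and merge the hits.

```python
from collections import defaultdict


class ToyIndex:
    """Tiny in-memory inverted index: (field, term) -> set of doc ids.

    Hypothetical stand-in for a real index; it does no scoring.
    """

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_id, list_name, body):
        # Body text is tokenized on whitespace and lowercased.
        for term in body.lower().split():
            self.postings[("body", term)].add(doc_id)
        # The list name is indexed whole, as one "keyed" term.
        self.postings[("list", list_name)].add(doc_id)

    def search(self, term, lists=None):
        hits = set(self.postings.get(("body", term.lower()), set()))
        if lists is not None:
            # Restrict to the requested lists by intersecting postings.
            allowed = set()
            for name in lists:
                allowed |= self.postings.get(("list", name), set())
            hits &= allowed
        return sorted(hits)


def search_per_list(indexes, term, lists=None):
    """Layout A: one index per list; query each one and merge the hits."""
    names = list(indexes) if lists is None else lists
    hits = []
    for name in names:
        hits.extend(indexes[name].search(term))
    return sorted(hits)


# Made-up sample data: (doc id, list name, body).
messages = [
    ("m1", "httpd-dev", "mod_mbox now supports MIME"),
    ("m2", "lucene-user", "MultiSearcher performance question"),
    ("m3", "apr-dev", "performance of file scans"),
]

# Layout A: one index per mailing list.
per_list = defaultdict(ToyIndex)
# Layout B: a single index with the list name as a keyed field.
single = ToyIndex()

for doc_id, list_name, body in messages:
    per_list[list_name].add(doc_id, list_name, body)
    single.add(doc_id, list_name, body)

print(search_per_list(per_list, "performance"))            # all lists
print(single.search("performance"))                        # all lists
print(single.search("performance", lists=["apr-dev"]))     # one list
```

In the toy, searching all lists in layout A costs one lookup per index
(300 of them for the ASF archives), while layout B does one lookup plus
an optional intersection with the list-name postings.  Whether real
Lucene behaves the same way at this scale is exactly the open question.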

Ideas/Comments?  Anyone willing to help me write some C :) ?

Thanks,

-Paul
