lucene-java-user mailing list archives

From jt oob <>
Subject (Distributed) Search system designs
Date Fri, 14 May 2004 14:19:45 GMT

I currently have a working search system based on Lucene 1.2, as follows:

14 indexes, average size just over 1G, min size 36M, max size 3.3G,
total size 15G.

Search times are currently between 20s and 4 minutes depending on the
query. The system uses a MultiSearcher to search all indexes, which are
currently all stored on an internal RAID.
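
For reference, the merge step that any multi-index search has to perform
can be sketched as below. This is a minimal, self-contained Java sketch
under my own assumptions, not Lucene's MultiSearcher code: the class
ShardMerge, the Hit type and the topN method are hypothetical names chosen
for illustration, and the sketch assumes scores are comparable across
indexes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical names throughout; this is not the Lucene API.
class ShardMerge {

    // One scored hit coming back from one index.
    static final class Hit {
        final int index;   // which of the indexes produced the hit
        final int doc;     // document number within that index
        final float score; // relevance score, assumed comparable across indexes

        Hit(int index, int doc, float score) {
            this.index = index;
            this.doc = doc;
            this.score = score;
        }
    }

    // Merge per-index hit lists into the global top n, highest score first.
    static List<Hit> topN(List<List<Hit>> perIndex, int n) {
        List<Hit> out = new ArrayList<>();
        if (n <= 0) {
            return out;
        }
        // Min-heap on score: the weakest of the current top n sits at the head.
        PriorityQueue<Hit> heap =
                new PriorityQueue<>((a, b) -> Float.compare(a.score, b.score));
        for (List<Hit> hits : perIndex) {
            for (Hit h : hits) {
                if (heap.size() < n) {
                    heap.add(h);
                } else if (h.score > heap.peek().score) {
                    heap.poll(); // drop the weakest hit kept so far
                    heap.add(h);
                }
            }
        }
        out.addAll(heap);
        out.sort((a, b) -> Float.compare(b.score, a.score)); // best first
        return out;
    }
}
```

Only n hits are held in the heap at any time, so the merge itself stays
cheap however many indexes are searched. The cost that matters is the
per-index search that produces each hit list.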

There are lots of things wrong with the index, including many words
that should be in stop lists but aren't.

The search runs on a Linux system with 8G of RAM and 2G of swap.

- - - -
I am looking at writing a replacement system, and this time trying to
do everything properly, writing document parsers etc.

Any pointers would be well received!

The questions:

1) The documentation about how to get a basic Lucene search going is
great. Is there any similar documentation or a HOWTO on how to design
and implement distributed searches?

2) For distributed searches, what are the best options for building in
redundancy? Is a large shared storage solution such as a SAN required, or
will duplicating indexes on several machines suffice?

3) I had been told that using RAMDirectory on a Linux system was
pointless because the kernel caches files in spare RAM anyway. Is this


