lucene-java-user mailing list archives

From Otis Gospodnetic <>
Subject Re: Improving Search Performance on Large Indexes
Date Thu, 24 May 2007 18:38:37 GMT

Yes: take your big index and split it into multiple smaller shards.  Put those shards on different
servers and query them remotely (using the RMI-based RemoteSearchable that Lucene ships, or something
custom), take the top N results from each searcher, merge those, and take the top N from the merged
result set.
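The merge step can be sketched in plain Java with no Lucene dependency. Assuming each shard returns its own top-N list of (docId, score) pairs (the `Hit` type and names below are illustrative, not Lucene API), a bounded min-heap keeps the overall top N:

```java
import java.util.*;

public class TopNMerge {
    // A (document id, score) pair as one shard might return it (illustrative type).
    record Hit(String docId, float score) {}

    // Merge per-shard top-N lists and keep the overall top N by score.
    static List<Hit> mergeTopN(List<List<Hit>> shardResults, int n) {
        // Min-heap bounded at n: the lowest-scoring hit sits on top and is evicted first.
        PriorityQueue<Hit> heap = new PriorityQueue<>(Comparator.comparingDouble(Hit::score));
        for (List<Hit> shard : shardResults) {
            for (Hit hit : shard) {
                heap.offer(hit);
                if (heap.size() > n) heap.poll();
            }
        }
        // Drain the heap and sort descending so the best hit comes first.
        List<Hit> merged = new ArrayList<>(heap);
        merged.sort(Comparator.comparingDouble(Hit::score).reversed());
        return merged;
    }
}
```

Because each shard already truncated to its local top N, the merge only ever touches shards × N hits, which stays cheap even with many shards.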

You could also experiment with a memory mapped Directory implementation.
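If I recall correctly, recent Lucene versions ship this as MMapDirectory (org.apache.lucene.store), which you can use in place of a plain FSDirectory. The underlying mechanism, mapping index files into virtual memory so reads go through the OS page cache instead of explicit I/O, can be illustrated with plain java.nio (a sketch, not Lucene code):

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;

public class MmapRead {
    // Map a file into memory and read a slice through the mapping.
    // This is essentially what a memory-mapped Directory does for index files:
    // the OS pages data in on demand and caches it, avoiding read() syscalls.
    static byte[] readMapped(Path file, int offset, int length) throws IOException {
        try (FileChannel channel = FileChannel.open(file)) {
            MappedByteBuffer buf = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            byte[] out = new byte[length];
            buf.position(offset);
            buf.get(out);
            return out;
        }
    }
}
```

Note that on 32-bit JVMs the address space limits how much of a 26 GB index can be mapped at once, so this option is mainly attractive on 64-bit machines.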


----- Original Message ----
From: Scott Sellman <>
Sent: Thursday, May 24, 2007 1:31:49 PM
Subject: Improving Search Performance on Large Indexes



Currently we are attempting to optimize search time against an index
that is 26 GB in size (~35 million docs), and I was wondering what
experiences others have had with similar attempts.  Simple searches
against the index are still fast even at 26 GB, but the problem is that
our application gives the user a lot of search options, which can
generate complicated queries.  Based on previous posts we decided to try
splitting our index into multiple indexes and using ParallelMultiSearcher.
When we split our single index into 6 separate ones we recorded a 25%
decrease in response time under minimal load.  We haven't done any stress
testing yet; has anyone noticed problems with increased load when
using ParallelMultiSearcher?  What about using machines with more
processors in combination with ParallelMultiSearcher?  Does that improve
response time much, or is the slowdown primarily disk access?
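For a mental model of why more processors can help: ParallelMultiSearcher runs one search task per sub-index and then merges the partial results, so up to the number of shards, extra cores let the sub-searches proceed concurrently (until disk becomes the bottleneck). The fan-out pattern looks roughly like this plain-Java sketch (illustrative, not Lucene's actual implementation):

```java
import java.util.*;
import java.util.concurrent.*;

public class ParallelFanOut {
    // Run one search task per shard concurrently and collect the per-shard
    // results in shard order. This mirrors the ParallelMultiSearcher model:
    // parallel sub-searches followed by a merge of the partial hit lists.
    static <T> List<T> searchAll(List<Callable<T>> shardSearches, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<T> results = new ArrayList<>();
            // invokeAll blocks until every task completes and preserves order.
            for (Future<T> f : pool.invokeAll(shardSearches)) {
                results.add(f.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }
}
```

Under heavy query load the picture changes: concurrent queries already keep all cores busy, so per-query parallelism mostly adds thread-scheduling and merge overhead, which may explain mixed results people report under stress.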


Any recommendations are welcome. 


Thanks in advance, 


