lucene-java-user mailing list archives

From Mathias Lux <math...@juggle.at>
Subject Re: Incremental updates / slow searches.
Date Tue, 10 Oct 2006 06:14:59 GMT
Rickard Bäckman wrote:
> Hi,
> 
> we are using a search system based on Lucene and have recently tried to add
> incremental updating of the index instead of building a new index every now
> and then. However, we now run into problems as our searches start to take
> a very long time to complete.
> 
> Our index is about 8-9GB large and we are sending lots of updates / second
> (we are probably merging in 200 - 300 in a few seconds). Today we buffer a
> bunch of updates and then merge them into the existing index like a batch,
> first doing deletes and then inserts.

Hi Rickard,

Perhaps you could try another strategy: if your "main" index doesn't
change much over a short time, you can leave it unchanged for, let's
say, a day. All inserts go into a second, small index, and searching is
done with a MultiSearcher over both indices.
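
To make the idea of one search over two indices concrete, here is a
minimal sketch in plain Java. It is a toy stand-in, not Lucene code: the
class CombinedSearch, the map-based "indices" and the term-count score
are all assumptions made up for illustration. In Lucene, MultiSearcher
would do the merging of the hit lists for you.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Toy stand-in for searching a main and a small index together.
// Nothing here is Lucene API; all names are hypothetical.
class CombinedSearch {

    /** One hit: document id plus a crude score (occurrences of the term). */
    static final class Hit {
        final String id;
        final int score;

        Hit(String id, int score) {
            this.id = id;
            this.score = score;
        }
    }

    /** Scans one "index" (id -> text) and scores every matching document. */
    static List<Hit> searchOne(Map<String, String> index, String term) {
        String needle = term.toLowerCase();
        List<Hit> hits = new ArrayList<>();
        for (Map.Entry<String, String> e : index.entrySet()) {
            int score = 0;
            for (String token : e.getValue().toLowerCase().split("\\s+")) {
                if (token.equals(needle)) {
                    score++;
                }
            }
            if (score > 0) {
                hits.add(new Hit(e.getKey(), score));
            }
        }
        return hits;
    }

    /** Queries both indices and merges the hits into a single ranking. */
    static List<Hit> searchBoth(Map<String, String> mainIndex,
                                Map<String, String> smallIndex,
                                String term) {
        List<Hit> merged = new ArrayList<>(searchOne(mainIndex, term));
        merged.addAll(searchOne(smallIndex, term));
        // Highest score first; ties broken by id so the order is stable.
        merged.sort(Comparator.comparingInt((Hit h) -> h.score).reversed()
                .thenComparing((Hit h) -> h.id));
        return merged;
    }
}
```

The caller only ever sees one ranked list, so the search code does not
need to know that the data lives in two places.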

Pros (+) and cons (-):
 + Big index remains optimized
 + Small index can be optimized after every x-th update, which doesn't
   take that long
 + Search time is reduced because both indices are optimized
 + Small index can be updated on a remote indexing machine and then
   distributed to the search servers (which could be a pain in the ass
   with 9 gigs of data *gg*)
 - Merge has to be done at night
 - Deletes have to be scheduled for the night
 - If you don't do the deletes, you may get outdated results
   (depends on your scenario)
 - You are using MultiSearcher, which might be somewhat slower

Of course you can also apply the deletes to the main index as part of
the updates, but then that index is no longer optimized. It's a kind of
trade-off between accuracy and runtime, as with everything in retrieval.
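
The effect of postponing deletes can be sketched with a small in-memory
model. Again, this is plain Java and not Lucene API: DeferredDeleteIndex,
loadMain, nightlyMerge and the substring matching are hypothetical names
and simplifications chosen only to show the behaviour.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model of deferred deletes plus a nightly merge; not Lucene API.
class DeferredDeleteIndex {
    // Large index, left untouched during the day.
    private final Map<String, String> mainIndex = new HashMap<>();
    // Small index that receives all daytime inserts.
    private final Map<String, String> smallIndex = new HashMap<>();
    // Deletes against the main index, queued for the night.
    private final Set<String> pendingDeletes = new HashSet<>();

    /** Seeds the main index, e.g. from the last full build. */
    void loadMain(String id, String text) {
        mainIndex.put(id, text);
    }

    /** Daytime insert: only the small index is written. */
    void insert(String id, String text) {
        smallIndex.put(id, text);
    }

    /** Daytime delete: the small index changes now, the main index at night. */
    void delete(String id) {
        smallIndex.remove(id);
        if (mainIndex.containsKey(id)) {
            pendingDeletes.add(id);
        }
    }

    /** Ids of all documents whose text contains the term, from both indices. */
    Set<String> search(String term) {
        Set<String> ids = new TreeSet<>();
        for (Map.Entry<String, String> e : mainIndex.entrySet()) {
            if (e.getValue().contains(term)) {
                ids.add(e.getKey()); // may be stale until the nightly merge
            }
        }
        for (Map.Entry<String, String> e : smallIndex.entrySet()) {
            if (e.getValue().contains(term)) {
                ids.add(e.getKey());
            }
        }
        return ids;
    }

    /** Night job: apply queued deletes, fold the small index into the main one. */
    void nightlyMerge() {
        mainIndex.keySet().removeAll(pendingDeletes);
        mainIndex.putAll(smallIndex);
        smallIndex.clear();
        pendingDeletes.clear();
    }
}
```

With "a" and "b" loaded into the main index, "c" inserted and "a" deleted
during the day, a search still returns "a" until nightlyMerge() has run.
That is the accuracy you give up in exchange for leaving the big index
alone.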

Obviously this strategy has pros and cons, and whether it is
appropriate for you depends on the use case. I'd like to hear
other opinions too ;)

hope that helps a bit,
  Mathias

-- 
    '   '    '
      '   '    '     Mathias Lux
 o/          '  \o   mathias@juggle.at
 /-'            -\   skype://dermotte, icq # 1988617
/\               /\  http://www.SemanticMetadata.net


