lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Yonik Seeley" <yo...@apache.org>
Subject Re: Incremental updates / slow searches.
Date Tue, 10 Oct 2006 03:52:59 GMT
On 10/9/06, Chris Hostetter <hossman_lucene@fucit.org> wrote:
> don't forget to optimize your index every now and then as well... deleting
> a document just marks it as "deleted" it still gets inspectected by every
> query during scoring at least once to see that it can skip it, optimizing
> is the only thing that truely removes the "deleted" documents.

I'd refine that statement to "optimizing is the easiest way to remove
any deleted documents that still exist in the index".

Deleted documents are removed from segments that are merged, so it
depends on things like the mergeFactor, maxBufferedDocs, and where the
deleted docs are in the index (in the smallest or largest segments).
Some deleted docs will be removed quickly, but some won't.

Optimizing an index also has a beneficial effect on search speed even
beyond removing all of the deleted docs.  Each index segment is
actually a complete index on it's own... so if search is generally
O(log(N)), searching across M segments of since N will take M *
log(N).  If those segments are "optimized" into a single segment, the
search will be O(log(M*N)).

-Yonik
http://incubator.apache.org/solr Solr, the open-source Lucene search server

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message