lucene-dev mailing list archives

From Marvin Humphrey
Subject Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)
Date Wed, 06 Sep 2006 23:57:40 GMT

On Sep 6, 2006, at 4:23 PM, Ning Li wrote:

> When do you add "merge-worthy" segments? I'd guess at the end of a
> session, when it's easy to decide which segments are "merge-worthy".

Right.  KinoSearch (KS) sorts the segments by size, then tries to
merge the smallest ones away.  The calculation uses the Fibonacci
series, the idea being to perform the fewest merges while keeping the
number of segments manageable.
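
As a rough illustration of that idea, here is a minimal Python sketch. It is not the actual KinoSearch code, and the exact rule is an assumption: the function names, the `offset` parameter, and the comparison against a Fibonacci threshold are all hypothetical. It sorts segment sizes, then picks the largest group of smallest segments whose combined size stays under a Fibonacci number chosen by how many segments would remain after the merge.

```python
def fibonacci(n):
    """Return the n-th Fibonacci number, with fibonacci(0) == 0 and fibonacci(1) == 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def pick_merge_worthy(sizes, offset=5):
    """Return the sizes of the smallest segments that look worth merging.

    Hypothetical rule (an assumption, not taken from KinoSearch):
    the k smallest segments are merge-worthy if their combined size is
    below fibonacci(remaining + offset), where `remaining` is the number
    of segments left once those k are merged into one.
    """
    ordered = sorted(sizes)          # smallest segments first
    n = len(ordered)
    best = 0                         # largest qualifying k seen so far
    running = 0                      # combined size of the k smallest
    for k, size in enumerate(ordered, start=1):
        running += size
        remaining = n - k + 1        # one merged segment plus the untouched ones
        if running < fibonacci(remaining + offset):
            best = k
    # Merging fewer than two segments would accomplish nothing.
    return ordered[:best] if best >= 2 else []


if __name__ == "__main__":
    print(pick_merge_worthy([1000, 3, 5, 2, 400]))  # prints [2, 3, 5]
```

With many small segments the threshold is generous, so they get folded together; a few large segments are left alone, which keeps the number of merges low.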

> If so, however, a newer doc could get a smaller docid than an older
> doc, right? It's a nice property of Lucene that an older doc always
> has a smaller docid. I think some applications use this to decide
> newer/older versions of a document.

Correct.  That information is not preserved with this algorithm.

> This means no new documents are visible to IndexReader until a session
> is over. In some sense, "1 segment/commit per session" lets an
> application decide when a "merge" happens.

Yes.  And since there's only one class in KinoSearch which modifies  
the index (InvIndexer), all adds and deletes are committed at the  
same time.
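
To make the visibility point concrete, here is a small Python sketch of a single-writer session. It is only a model of the behaviour described above, not the InvIndexer API: the `Session` class and its `add_doc`, `delete_doc`, and `finish` methods are hypothetical names, and applying deletes before adds is an assumption. Readers see only `committed`, which changes once, when the session finishes.

```python
class Session:
    """Hypothetical single-writer session that buffers changes until finish()."""

    def __init__(self, committed):
        self.committed = dict(committed)  # what readers can currently see
        self._adds = {}                   # pending additions, invisible to readers
        self._deletes = set()             # pending deletions, invisible to readers

    def add_doc(self, doc_id, text):
        self._adds[doc_id] = text

    def delete_doc(self, doc_id):
        self._deletes.add(doc_id)

    def finish(self):
        """Apply all pending deletes and adds in one step and publish the result."""
        view = {k: v for k, v in self.committed.items() if k not in self._deletes}
        view.update(self._adds)
        self.committed = view             # adds and deletes become visible together
        self._adds, self._deletes = {}, set()
        return self.committed


if __name__ == "__main__":
    session = Session({"a": "old"})
    session.add_doc("b", "new")
    session.delete_doc("a")
    print(session.committed)   # still {'a': 'old'}: nothing is visible yet
    print(session.finish())    # {'b': 'new'}: both changes land at once
```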

Marvin Humphrey
Rectangular Research
