lucene-java-user mailing list archives

From Daniel Noll <dan...@nuix.com>
Subject Re: Document ID shuffling under 2.3.x (on merge?)
Date Tue, 11 Mar 2008 21:39:19 GMT
On Tuesday 11 March 2008 19:55:39 Michael McCandless wrote:
> Hi Daniel,
>
> 2.3 should be no different from 2.2 in that docIDs only "shift" when
> a merge of segments with deletions completes.
>
> Could it be the ConcurrentMergeScheduler?  Merges now run in the
> background by default and commit whenever they complete.  You can get
> back to the previous (blocking) behavior by using
> SerialMergeScheduler instead.

That was my first thought, but SerialMergeScheduler doesn't cause the problem.  
I've done a little more investigation since; it turns out that if I don't 
call optimize() then the problem doesn't occur.

Could it be that optimize(int,boolean) is storing the segments to optimise in 
a HashSet, which by its nature reorders the segments?
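
To illustrate the concern about set ordering, here is a minimal sketch in
plain Java. It is not Lucene's code: the class name and the segment names are
made up for the example. It only shows that HashSet makes no promise about
iteration order, whereas LinkedHashSet iterates in insertion order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo class; none of these names come from Lucene.
class SegmentOrderDemo {
    // Returns the elements of the set in the order the set iterates them.
    public static List<String> iterationOrder(Set<String> set) {
        return new ArrayList<String>(set);
    }

    public static void main(String[] args) {
        // Made-up segment names, listed oldest first.
        List<String> segments = Arrays.asList("_0", "_1", "_2", "_a", "_b", "_1f");

        // HashSet: iteration order depends on hash codes and table size,
        // so it may or may not match the insertion order.
        System.out.println("HashSet:       "
                + iterationOrder(new HashSet<String>(segments)));

        // LinkedHashSet: iteration order is the insertion order.
        System.out.println("LinkedHashSet: "
                + iterationOrder(new LinkedHashSet<String>(segments)));
    }
}
```

If the segments to merge really are collected in a HashSet and then walked in
iteration order, the merged segment could see them in a different order from
the one they had in the index.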

> If it's not that ... can you provide more details about how your
> application is relying on docIDs?

As for that: we assume that if there are N documents in the index, then the
next document ID will be N (we determine this before adding the document).
As we're only doing this in a single thread and we never delete documents,
this was previously safe.
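
A toy model of that assumption, again in plain Java rather than Lucene: it
treats a document ID as the document's position when all segments are laid
end to end in segment order, which is a simplification. The class name and
the sample documents are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical toy model; it does not use or reproduce any Lucene API.
class DocIdModel {
    // Concatenates segments in the given order. A document's ID in this
    // model is its index in the returned list.
    public static List<String> flatten(List<List<String>> segments) {
        List<String> docs = new ArrayList<String>();
        for (List<String> segment : segments) {
            docs.addAll(segment);
        }
        return docs;
    }

    public static void main(String[] args) {
        List<String> seg0 = Arrays.asList("a", "b");
        List<String> seg1 = Arrays.asList("c");
        List<String> seg2 = Arrays.asList("d", "e");

        // Segments merged in their original order.
        List<String> inOrder = flatten(Arrays.asList(seg0, seg1, seg2));
        // The same segments merged in a different order.
        List<String> reordered = flatten(Arrays.asList(seg2, seg0, seg1));

        // With N documents and no deletions, the next ID is N either way.
        System.out.println("next ID: " + inOrder.size());
        // But the ID of an existing document depends on the merge order.
        System.out.println("ID of d, original order:  " + inOrder.indexOf("d"));
        System.out.println("ID of d, reordered merge: " + reordered.indexOf("d"));
    }
}
```

In this model the document count, and so the predicted next ID, is unaffected
by the merge order, but any ID recorded before the merge can end up pointing
at a different document afterwards.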

Daniel
