lucene-java-user mailing list archives

From Dan Armbrust <daniel.armbrust.l...@gmail.com>
Subject Lucene Document order not being maintained?
Date Wed, 05 Apr 2006 15:24:09 GMT
I'm using Lucene 1.9.1, and I'm seeing some odd behavior that I hope 
someone can help me with.

My application counts on Lucene maintaining the order of the documents 
exactly the same as how I insert them.  Lucene is supposed to maintain 
document order, even across index merges, correct?

My indexing process works as follows (some of this is a holdover from
the time before Lucene had a compound file format, so bear with me):

I open a file-based index with a merge factor of 90 and, in my
current test, the compound file format.  When I have added 100,000
documents, I close this index and start a new one.  I continue
this until I'm done with all of the documents.  Then, as a last step, I
open a new, empty index and call addIndexes(Directory[]), passing
in the directories in the same order that I created them.
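
To make the expectation concrete, here is a minimal sketch of the
numbering I would expect if the final merge simply concatenated the
sub-indexes in the order given. It is plain Java with no Lucene calls;
the class and method names are hypothetical, and it assumes every
sub-index before the last holds exactly 100,000 documents and that
nothing has been deleted.

```java
final class MergedDocNumbers {

    /**
     * Hypothetical helper: the document number expected in the merged index
     * for the localDoc-th document (0-based) of the subIndex-th sub-index
     * (0-based), ASSUMING the merge concatenates sub-indexes in order and
     * every earlier sub-index holds exactly batchSize documents.
     */
    static long expectedMergedDocNumber(int subIndex, int localDoc, int batchSize) {
        if (subIndex < 0 || batchSize <= 0 || localDoc < 0 || localDoc >= batchSize) {
            throw new IllegalArgumentException("argument out of range");
        }
        // Widen to long before multiplying so large indexes cannot overflow int.
        return (long) subIndex * batchSize + localDoc;
    }

    public static void main(String[] args) {
        int batchSize = 100_000; // documents per sub-index, as described above
        System.out.println(expectedMergedDocNumber(0, 899, batchSize)); // 899
        System.out.println(expectedMergedDocNumber(1, 0, batchSize));   // 100000
    }
}
```

Under that assumption, every document from the first sub-index should
appear before any document from the second.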


This allows me to use higher merge factors without running into file 
handle issues, and without having to call optimize.

The problem I am seeing right now is that when I look into my
large combined index with Luke, Document number 899 is the 899th
document that I added.  However, Document 900 is the 49860th document
that I added.  This continues until Document 910, where it suddenly
jumps to the 99720th document.
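
One way to pin down every such jump, rather than spotting them by eye
in Luke, would be to give each document an insertion counter and scan
the counters in document-number order. The sketch below is plain Java
and assumes the counters have already been read back into an array; the
class name, method name, and sample data are hypothetical and are not
taken from my index.

```java
import java.util.ArrayList;
import java.util.List;

final class OrderCheck {

    /**
     * Returns every document number whose insertion counter is not exactly
     * one more than the counter of the preceding document. The array is
     * indexed by document number; seqByDocNumber[n] is the insertion counter
     * stored with document n (a hypothetical stored field).
     */
    static List<Integer> findOrderBreaks(int[] seqByDocNumber) {
        List<Integer> breaks = new ArrayList<>();
        for (int doc = 1; doc < seqByDocNumber.length; doc++) {
            if (seqByDocNumber[doc] != seqByDocNumber[doc - 1] + 1) {
                breaks.add(doc);
            }
        }
        return breaks;
    }

    public static void main(String[] args) {
        // Made-up data: order holds for documents 0-2, then jumps twice.
        int[] counters = {0, 1, 2, 50, 51, 100};
        System.out.println(findOrderBreaks(counters)); // [3, 5]
    }
}
```

An empty result would mean insertion order was preserved end to end;
otherwise the list gives the document numbers where each jump begins.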

Is this a bug, or am I misusing something in the API?

Thanks,

Dan


-- 
****************************
Daniel Armbrust
Biomedical Informatics
Mayo Clinic Rochester
daniel.armbrust(at)mayo.edu
http://informatics.mayo.edu/


