lucene-java-user mailing list archives

From Doug Cutting <cutt...@apache.org>
Subject Re: Lucene Document order not being maintained?
Date Wed, 05 Apr 2006 21:23:09 GMT
Dan Armbrust wrote:
> My indexing process works as follows (and some of this is hold-over from 
> the time before lucene had a compound file format - so bear with me)
> 
> I open up a File based index - using a merge factor of 90, and in my 
> current test, the compound index format.  When I have added 100,000 
> documents, I close this index, and start on a new index.  I continue 
> this until I'm done with all of the documents.  Then, as a last step, I 
> open up a new empty index, and I call addIndexes(Directory[]) - and I 
> pass in the directories in the same order that I created them.
> 
> This allows me to use higher merge factors without running into file 
> handle issues, and without having to call optimize.

As others have noted, this should work correctly.

I assume that your merge factor when calling addIndexes() is less than 
90.  If it's 90, then what you're doing is the same as Lucene would 
automatically do.  I think you could save yourself a lot of trouble if 
you simply lowered your merge factor substantially and then indexed 
everything in one pass.  To make things go faster, set 
maxBufferedDocs=100 or larger.  This should be as fast as what you're 
doing now and a lot simpler.
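
A rough way to see the effect of the merge factor is a toy model, sketched below. It is not Lucene code, and every name in it (MergeModel, simulate, Result) is made up for illustration. It assumes a segment is flushed every maxBufferedDocs documents and that whenever the newest mergeFactor segments have the same size they are merged into one. Only adjacent segments are merged, so document order is kept. The model ignores maxMergeDocs, compound files and any other detail of the real merge logic.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of logarithmic segment merging. Hypothetical, not Lucene code. */
final class MergeModel {

    /** Segment sizes left at the end, plus the largest segment count seen. */
    static final class Result {
        final List<Long> segments;
        final int peakSegments;

        Result(List<Long> segments, int peakSegments) {
            this.segments = segments;
            this.peakSegments = peakSegments;
        }
    }

    static Result simulate(long totalDocs, int maxBufferedDocs, int mergeFactor) {
        List<Long> segments = new ArrayList<>();
        int peak = 0;
        long flushes = totalDocs / maxBufferedDocs;
        for (long i = 0; i < flushes; i++) {
            // Assumption: one new segment per maxBufferedDocs documents.
            segments.add((long) maxBufferedDocs);
            peak = Math.max(peak, segments.size());
            // Merges can cascade: a merged segment may complete a larger group.
            while (tailIsMergeable(segments, mergeFactor)) {
                mergeTail(segments, mergeFactor);
            }
        }
        long remainder = totalDocs % maxBufferedDocs;
        if (remainder > 0) {
            // Documents still buffered at close time become a final segment.
            segments.add(remainder);
            peak = Math.max(peak, segments.size());
        }
        return new Result(segments, peak);
    }

    /** True when the newest mergeFactor segments all have the same size. */
    private static boolean tailIsMergeable(List<Long> segments, int mergeFactor) {
        int n = segments.size();
        if (n < mergeFactor) {
            return false;
        }
        long newest = segments.get(n - 1);
        for (int i = n - mergeFactor; i < n; i++) {
            long size = segments.get(i);
            if (size != newest) {
                return false;
            }
        }
        return true;
    }

    /** Replaces the newest mergeFactor segments with one combined segment. */
    private static void mergeTail(List<Long> segments, int mergeFactor) {
        long merged = 0;
        for (int i = 0; i < mergeFactor; i++) {
            merged += segments.remove(segments.size() - 1);
        }
        segments.add(merged);
    }

    public static void main(String[] args) {
        for (int mergeFactor : new int[] {10, 90}) {
            Result r = simulate(1_000_000L, 100, mergeFactor);
            System.out.println("mergeFactor=" + mergeFactor
                    + " peak=" + r.peakSegments
                    + " final=" + r.segments.size());
        }
    }
}
```

Under these assumptions, 1,000,000 documents with maxBufferedDocs=100 never exceed 37 segments at once with a merge factor of 10, while a merge factor of 90 peaks at 179. That peak is what drives the number of open files, so the exact figures matter less than the gap between them.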

Or is that the part where I was supposed to "bear with" you?

Doug


