lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Mark Miller <markrmil...@gmail.com>
Subject Re: Batch Indexing - best practice?
Date Mon, 15 Mar 2010 14:47:48 GMT
On 03/15/2010 10:41 AM, Murdoch, Paul wrote:
> Hi,
>
>
>
> I'm using Lucene 2.9.2.  Currently, when creating my index, I'm calling
> indexWriter.addDocument(doc) for each Document I want to index.  The
> Documents aren't large and I'm averaging indexing about 500 documents
> every 90 seconds.  I'd like to try and speed this up....unless 90
> seconds for 500 Documents is reasonable.  I have the merge factor set to
> 1000.  Do you have any suggestions for batch indexing?  Is there
> something like indexWriter.addDocuments(Document[] docs) in the API?
>
>
>
> Thanks.
>
> Paul
>
>
>
>
>    
You should lower that merge factor - thats *really* high.

You shouldn't really need much more than 50 or so ... and for search 
speed your going to want fewer segments anyway -
if your just going to end up optimizing at the end, there is no reason 
for such a large merge factor - you will pay for most of what
you saved when you optimize.

That is very slow by the way. Should be much faster - especially if you 
are using multiple threads.

-- 
- Mark

http://www.lucidimagination.com




---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message