lucene-java-user mailing list archives

From "Michael McCandless" <>
Subject Re: speedup indexing
Date Tue, 07 Aug 2007 08:47:52 GMT

"Mike Klaas" <> wrote:

> > On 8/6/07, testn <> wrote:
> >>
> >> 2. To improve indexing speed, you can consider using the trunk code
> >> which includes LUCENE-843. The indexing speed will be faster by almost
> >> an order of magnitude.
> While a speedup should be expected, I don't know that an order of
> magnitude is a realistic expectation to convey.  Unless, of course,
> you're speaking in base two ;)

Right, it's important to not overstate things here...

First off, the speedups in LUCENE-843 *only* apply to the actual time
spent in Lucene's indexing code.  I.e., time spent retrieving the doc,
running the analyzer, etc., will not get any faster (though there is
ongoing work to speed up the core analyzers...), so if the bulk of the
time in an application is not in Lucene's indexing then the speedups
will be minor.
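
To put rough numbers on that point, here is a minimal sketch using Amdahl's law.  The class name, the method name, and the 4x indexing speedup are all made up for illustration; they are not measurements of LUCENE-843.

```java
public class OverallSpeedup {
    // Amdahl's law: overall speedup when only a fraction of the total
    // run time (here, the part spent in the indexing code) gets faster.
    // indexingFraction is in [0, 1]; indexingSpeedup is > 0.
    static double overallSpeedup(double indexingFraction, double indexingSpeedup) {
        double unchanged = 1.0 - indexingFraction;           // doc retrieval, analysis, etc.
        double improved = indexingFraction / indexingSpeedup; // indexing code after the speedup
        return 1.0 / (unchanged + improved);
    }

    public static void main(String[] args) {
        // Hypothetical: the indexing code itself becomes 4x faster.
        double indexingSpeedup = 4.0;
        double[] fractions = {0.2, 0.5, 0.9};
        for (double f : fractions) {
            System.out.printf("fraction of time in indexing = %.1f -> overall speedup = %.2fx%n",
                    f, overallSpeedup(f, indexingSpeedup));
        }
    }
}
```

With these assumed numbers, an application that spends half its time in the indexing code sees only 1.6x overall, even though the indexing code itself is 4x faster.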

Second, the speedups are best for smaller docs and, at present, you
still need to either change your writer to flush by RAM or set a
large "maxBufferedDocs" in order to see the best gains.
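
As a rough illustration of why the flush setting matters for small docs, the toy model below just counts how often a buffer would be flushed under each policy.  It is not Lucene code: the class, the method names, and the numbers (1 KB docs, a 10-doc limit, a 16 MB budget) are assumptions picked for the example.

```java
public class FlushPolicyDemo {
    // Flushes needed when the buffer is flushed every maxBufferedDocs documents.
    static long flushesByDocCount(long numDocs, int maxBufferedDocs) {
        return (numDocs + maxBufferedDocs - 1) / maxBufferedDocs; // ceiling division
    }

    // Flushes needed when the buffer is flushed once the buffered bytes
    // reach ramBudgetBytes. Every doc is assumed to take bytesPerDoc bytes.
    static long flushesByRam(long numDocs, long bytesPerDoc, long ramBudgetBytes) {
        long flushes = 0;
        long buffered = 0;
        for (long i = 0; i < numDocs; i++) {
            buffered += bytesPerDoc;
            if (buffered >= ramBudgetBytes) {
                flushes++;
                buffered = 0;
            }
        }
        if (buffered > 0) {
            flushes++; // final flush of whatever is left in the buffer
        }
        return flushes;
    }

    public static void main(String[] args) {
        long numDocs = 100_000;              // hypothetical corpus size
        long bytesPerDoc = 1024;             // hypothetical small docs, 1 KB each
        long ramBudget = 16L * 1024 * 1024;  // hypothetical 16 MB buffer

        System.out.println("flush every 10 docs: " + flushesByDocCount(numDocs, 10));
        System.out.println("flush by RAM (16 MB): " + flushesByRam(numDocs, bytesPerDoc, ramBudget));
    }
}
```

In this model, the smaller the docs, the more of them fit in a fixed RAM budget, so a small per-doc-count limit forces far more flushes than a RAM-based one.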

It's also important to try these suggestions -- they can potentially
make even more difference than LUCENE-843:

