lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Grant Ingersoll <gsing...@apache.org>
Subject Re: Index optimization ...
Date Wed, 30 Jul 2008 15:12:28 GMT
What version of Lucene are you using?  What is your current  
mergeFactor?  Lowering this (minimum is 2) will result in an index  
that is closer to "optimal" since an optimized index is just one that  
has all the segments merged into a single segment and a mergeFactor of  
2 just means there are only ever 2 segments (the docs could make this  
clearer).  The tradeoff is that you may need to do merges more often.   
If you are using Lucene 2.3.x these merges can now take place in the  
background, so this may not be as big a penalty as it once was.   
Still, the fact is, optimize has to go through, in the end, and merge  
your segments into one big segment.  This is a lengthy undertaking on  
a large index.

I'm not sure, however, if any of this will reduce your overall time.   
I suppose it depends somewhat on your update rate.  It is possible  
that the slower indexing could offset any gains had by a faster  
optimize.  Another option may be to keep the mergeFactor higher, but  
then every so often do partial optimizes to keep your index closer to  
optimal such that a final optimize may be speed up

Another question is have you measured your query performance on an  
unoptimized index?  Is it acceptable?  Are you only adding new  
documents or are you also deleting docs in that 4 hour time period?

Bottom line, though, is you need to test out the various knobs  
(mergeFactor, RAMBufferSizeMB, etc.) and see.  You may find the  
contrib/benchmark program helpful for running experiments, although it  
isn't a substitute for your actual data.

-Grant


On Jul 30, 2008, at 9:46 AM, Dragon Fly wrote:

> Perhaps I didn't explain myself clearly so please let me try it  
> again.  I'm happy with the search/indexing performance.   However,  
> my index gets fully optimized every 4 hours and the time it takes to  
> fully optimize the index is longer than I like.  Is there anything  
> that I can do to speed up the optimization? I don't fully understand  
> the different parameters (e.g. merge factor).  If I decrease the  
> merge factor, would it make the indexing slower (which I'm OK with)  
> but the optimization faster? Thank you.
>
>> Date: Tue, 29 Jul 2008 08:32:46 +0200
>> From: asbjorn@fellinghaug.com
>> To: java-user@lucene.apache.org
>> Subject: Re: Index optimization ...
>>
>> John Griffin:
>>> Use IndexWriter.setRAMBufferSizeMB(double mb) and you won't have to
>>> sacrifice anything. It defaults to 16.0 MB so depending on the  
>>> size of your
>>> index you may want to make it larger. Do some testing at various  
>>> values to
>>> see where the sweet spot is.
>>>
>>
>> Also, have a look at
>> http://wiki.apache.org/lucene-java/ImproveIndexingSpeed, which  
>> provides
>> a range of helping advices in terms of enhanced indexing speed.
>>
>> -- 
>> Asbjørn A. Fellinghaug
>> asbjorn@fellinghaug.com
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>
> _________________________________________________________________
> With Windows Live for mobile, your contacts travel with you.
> http://www.windowslive.com/mobile/overview.html?ocid=TXT_TAGLM_WL_mobile_072008

--------------------------
Grant Ingersoll
http://www.lucidimagination.com

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ








---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message