lucene-java-user mailing list archives

From "Max Pfingsthorn" <m.pfingsth...@hippo.nl>
Subject RE: Optimize completely in memory with a FSDirectory?
Date Fri, 07 Apr 2006 10:01:06 GMT
Hi all,

Sorry for the noise, it was my own fault. After a look at the sources, I saw that I had
misinterpreted the MaxBufferedDocs parameter.
IndexWriter.maybeMergeSegments() seems to always merge everything if it is set that high. For
my iterative updates of the index, the default setting of 10 seems to work very well.
This is because I close the IndexWriter after each batch of adds so that I can delete documents
in the next round. Closing the IndexWriter will behave like an optimize() if MaxBufferedDocs
is set much higher than the number of added documents.

Anyway, I've learned to leave the default settings alone and call optimize() periodically
(say, after every n added documents).
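
A minimal sketch of that "optimize after every n documents" pattern, in plain Java so
it runs without Lucene on the classpath. The Writer interface, BatchIndexer and
RecordingWriter are hypothetical names made up for illustration; in real code the two
interface methods would map to IndexWriter.addDocument() and IndexWriter.optimize(),
and the value of n is an assumption you would have to tune yourself.

```java
import java.util.ArrayList;
import java.util.List;

public class PeriodicOptimize {

    /** Hypothetical stand-in for the two IndexWriter calls this pattern needs. */
    interface Writer {
        void addDocument(String doc);
        void optimize();
    }

    /** Adds documents and triggers optimize() after every n additions. */
    static class BatchIndexer {
        private final Writer writer;
        private final int optimizeEvery;     // the "n" from the text (assumed tuning knob)
        private int addedSinceOptimize = 0;  // documents added since the last optimize()

        BatchIndexer(Writer writer, int optimizeEvery) {
            if (optimizeEvery <= 0) {
                throw new IllegalArgumentException("optimizeEvery must be positive");
            }
            this.writer = writer;
            this.optimizeEvery = optimizeEvery;
        }

        void add(String doc) {
            writer.addDocument(doc);
            addedSinceOptimize++;
            if (addedSinceOptimize >= optimizeEvery) {
                writer.optimize();
                addedSinceOptimize = 0;
            }
        }
    }

    /** Stand-in writer that only records what was called on it. */
    static class RecordingWriter implements Writer {
        final List<String> docs = new ArrayList<String>();
        int optimizeCalls = 0;

        public void addDocument(String doc) { docs.add(doc); }
        public void optimize() { optimizeCalls++; }
    }

    public static void main(String[] args) {
        RecordingWriter writer = new RecordingWriter();
        BatchIndexer indexer = new BatchIndexer(writer, 100);
        for (int i = 0; i < 250; i++) {
            indexer.add("doc-" + i);
        }
        System.out.println(writer.docs.size() + " docs, "
                + writer.optimizeCalls + " optimize() calls");
    }
}
```

With n = 100 and 250 added documents, the sketch optimizes twice (after the 100th and
the 200th document). If the IndexWriter is closed after every batch, as in my setup,
the counter would have to live outside the writer so that it survives between batches.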

However, if you do one _huge_ indexing batch, it may be worth raising this parameter
so that more memory is used while indexing.

Bye!
max

> -----Original Message-----
> From: Max Pfingsthorn 
> Sent: Thursday, April 06, 2006 11:25
> To: java-user@lucene.apache.org
> Subject: RE: Optimize completely in memory with a FSDirectory?
> 
> 
> Hi,
> 
> Thanks for your suggestion. I thought about the same, but 
> somehow it didn't seem like such a good idea... Now that I 
> think about it, it would take the same IO load (in terms of 
> flushing many megabytes to disk) as optimizing in memory with 
> the FSDirectory.
> 
> Another weird thing we observed here is this:
> 
> During incremental updates to a previously optimized index, 
> no matter what I set the merge factor at, it seems to 
> optimize or possibly merge much sooner than it should.
> More clearly:
> 
> I have an optimized index of around 150MB. I set merge factor 
> to 300, maxmergedocs to Integer.MAX_VALUE, minmergedocs 
> (maxbuffereddocs) to 50000 (I have 40000 docs in the index), 
> and still it merges after around 50-80 new documents. If I 
> understand merge factor right, it should not merge at all, 
> but start a new segment after 300 new documents.
> 
> Of course this is a very artificial set of parameters, but I 
> wanted to see what goes on. Could it have anything to do with 
> the fact that I close the indexwriter after each batch of 
> updates? Can anyone explain this?
> 
> max
> 
> > -----Original Message-----
> > From: Daniel Naber [mailto:lucenelist2005@danielnaber.de]
> > Sent: Wednesday, April 05, 2006 20:23
> > To: java-user@lucene.apache.org
> > Subject: Re: Optimize completely in memory with a FSDirectory?
> > 
> > 
> > On Wednesday 05 April 2006 13:02, Max Pfingsthorn wrote:
> > 
> > > The setMaxBufferedDocs and related parameters help a lot already to
> > > fully exploit my RAM when indexing, but since I'm running a fairly small
> > > index of around 40000 docs and I'm optimizing it relatively often, I was
> > > wondering if there is any way to enforce complete in-memory
> > > optimization.
> > 
> > Maybe you could use a RAMDirectory and write it to disk using 
> > IndexWriter.addIndexes() from time to time?
> > 
> > Regards
> >  Daniel
> > 
> > -- 
> > http://www.danielnaber.de
> > 
> > 
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
> > 
> > 
