lucene-dev mailing list archives

From Chuck Williams <ch...@manawiz.com>
Subject Dynamically varying maxBufferedDocs
Date Thu, 09 Nov 2006 18:24:17 GMT
Hi All,

Does anybody have experience dynamically varying maxBufferedDocs?  In my
app, I can never truncate docs and so work with maxFieldLength set to
Integer.MAX_VALUE.  Some documents are large, over 100 MBytes.  Most
documents are tiny.  So any fixed value of maxBufferedDocs small enough to
avoid OOMs on the large documents is too small for good ongoing performance.

It appears to me that the merging code will work fine if the initial
segment sizes vary.  E.g., a simple solution is to make
IndexWriter.flushRamSegments() public and manage this externally (for
which I already have all the needed apparatus, including size
information, the necessary thread synchronization, etc.).
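
As a rough illustration of the external management described above, here
is a minimal sketch of a size-aware flush trigger.  SizeAwareFlushPolicy
and its methods are hypothetical names, not part of Lucene; the sketch
assumes the caller knows each document's approximate size and that
IndexWriter.flushRamSegments() has been made public as proposed.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical helper (not Lucene API): tracks how much has been buffered
 * since the last flush and tells the caller when to flush.
 */
final class SizeAwareFlushPolicy {
    private final long maxBufferedBytes; // byte budget for buffered docs
    private final int maxBufferedDocs;   // upper bound on buffered doc count
    private long bufferedBytes = 0;
    private int bufferedDocs = 0;
    // Doc counts of the segments flushed so far, kept only for inspection.
    private final List<Integer> flushedSegmentSizes = new ArrayList<Integer>();

    SizeAwareFlushPolicy(long maxBufferedBytes, int maxBufferedDocs) {
        if (maxBufferedBytes <= 0 || maxBufferedDocs <= 0) {
            throw new IllegalArgumentException("limits must be positive");
        }
        this.maxBufferedBytes = maxBufferedBytes;
        this.maxBufferedDocs = maxBufferedDocs;
    }

    /**
     * Record a document that was just added to the writer.
     * Returns true if either limit has been reached, meaning the caller
     * should flush now (e.g. by calling flushRamSegments()).
     */
    synchronized boolean recordAdd(long docBytes) {
        bufferedBytes += docBytes;
        bufferedDocs++;
        return bufferedBytes >= maxBufferedBytes
            || bufferedDocs >= maxBufferedDocs;
    }

    /** Record that the caller has flushed; resets the counters. */
    synchronized void recordFlush() {
        if (bufferedDocs > 0) {
            flushedSegmentSizes.add(bufferedDocs);
        }
        bufferedBytes = 0;
        bufferedDocs = 0;
    }

    /** Snapshot of the doc counts of the flushed segments, oldest first. */
    synchronized List<Integer> flushedSegmentSizes() {
        return new ArrayList<Integer>(flushedSegmentSizes);
    }
}
```

The caller would invoke recordAdd() after each addDocument() and, when it
returns true, flush and then call recordFlush().  Because the check runs
after the add, the byte budget needs headroom for the largest expected
document.  The resulting initial segments vary in doc count (one huge
document, or many tiny ones), which is exactly the case the merge logic
would have to tolerate.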

A better solution might be to build a size-management option into the
maxBufferedDocs mechanism in Lucene, but at least for my purposes, that
doesn't appear necessary as a first step.

My main concern is that the mergeFactor escalation merging logic will
somehow behave poorly in the presence of dynamically varying initial
segment sizes.

I'm going to try this now, but am wondering if anybody has tried things
along these lines and might offer useful suggestions or admonitions.

Thanks for any advice,

Chuck



