lucene-solr-user mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: Optimize taking two steps and extra disk space
Date Tue, 21 Jun 2011 14:58:44 GMT
On Tue, Jun 21, 2011 at 9:42 AM, Shawn Heisey <solr@elyograg.org> wrote:
> On 6/20/2011 12:31 PM, Michael McCandless wrote:
>>
>> For back-compat, mergeFactor maps to both of these, but it's better to
>> set them directly eg:
>>
>>     <mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
>>       <int name="maxMergeAtOnce">10</int>
>>       <int name="segmentsPerTier">20</int>
>>     </mergePolicy>
>>
>> (and then remove your mergeFactor setting under indexDefaults)
>
> When I did this and ran a reindex, it merged once it reached 10 segments,
> despite what I had defined in the mergePolicy.  This is Solr 3.2 with the
> patch from SOLR-1972 applied.  I've included the config snippet below into
> solrconfig.xml using xinclude via another file.  I had to put mergeFactor
> back in to make it work right.  I haven't checked yet to see whether an
> optimize takes one pass.  That will be later today.
>
> <mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
> <int name="maxMergeAtOnce">35</int>
> <int name="segmentsPerTier">35</int>
> <int name="maxMergeAtOnceExplicit">105</int>
> </mergePolicy>

Hmm, something strange is going on.

In Solr 3.2, if you attempt to use mergeFactor and useCompoundFile
inside indexDefaults (and outside the mergePolicy) when your
mergePolicy is TieredMergePolicy (TMP), you should see a warning like
this:

  Use of compound file format or mergefactor cannot be configured if
  merge policy is not an instance of LogMergePolicy. The configured
  policy's defaults will be used.

And it shouldn't "work".  But, using the "right" params inside your
mergePolicy section ought to work (though, I don't think this is well
tested...).  I'm not sure why you're seeing the opposite of what I'd
expect...

I wonder if you're really getting the TMP?  Can you turn on verbose
IndexWriter infoStream and post the output?
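
For reference, a minimal sketch of how infoStream might be switched on
in solrconfig.xml. It assumes the infoStream element as it appears in
the stock Solr 3.x example config (under mainIndex); the file name is
only an example, and the exact placement may differ in a patched or
xincluded setup:

```xml
<!-- Sketch only: assumes the infoStream element from the stock
     Solr 3.x example solrconfig.xml; the file name is arbitrary. -->
<mainIndex>
  <!-- If true, IndexWriter writes verbose debug output to this file -->
  <infoStream file="INFOSTREAM.txt">true</infoStream>
</mainIndex>
```

After a restart and a reindex, the resulting file should show which
merge policy the IndexWriter was opened with, along with the merges it
selects.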

Mike McCandless

http://blog.mikemccandless.com
