lucene-dev mailing list archives

From Shai Erera <>
Subject Re: Solr IndexConfig and mergeFactor
Date Wed, 30 Jul 2014 06:47:07 GMT
I agree with the general statement that users may want to set such settings
without caring much about the actual MergePolicy that's used. But I think
that we currently don't do a very good job of allowing that:

   - I think mergeFactor is less clear than segmentsPerTier, which is a
   parameter name that explains itself. It also hides the benefit of TieredMP,
   which lets you separately control how many segments are merged at once vs.
   how many segments are allowed to exist in a tier before that tier is
   merged away.
   - mergeFactor is the only parameter we allow to be set outside the
   <mergePolicy> element; everything else requires that you define the MP
   class. So e.g. there's no way to control the maximum size that a segment
   can grow to without defining the MP class.
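
To illustrate the asymmetry, here is a rough sketch of the two styles,
assuming the <indexConfig> layout used in the Solr 4.x example
solrconfig.xml. The values are placeholders rather than recommendations, and
the element type used for maxMergedSegmentMB is my assumption. The two
fragments are alternatives, not meant to be combined:

```xml
<!-- Style 1: set on its own, without naming a merge policy -->
<indexConfig>
  <mergeFactor>10</mergeFactor>
</indexConfig>

<!-- Style 2: anything else requires naming the MP class -->
<indexConfig>
  <mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
    <!-- how many segments are merged at once -->
    <int name="maxMergeAtOnce">10</int>
    <!-- how many segments may exist in a tier before it is merged -->
    <int name="segmentsPerTier">10</int>
    <!-- assumed element type; caps the size a merged segment can grow to -->
    <double name="maxMergedSegmentMB">5000</double>
  </mergePolicy>
</indexConfig>
```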

So I think <mergeFactor> is there mostly for the first reason you give, that
"it pre-dates lucene MPs", but I don't think it is so useful that we should
make an exception for it, and only it.

Even though the example solrconfig.xml documents what this parameter means
for LogMP and TieredMP, I think that if users want to meddle with merge
settings, it is not uber-expert to ask them to specify which MP they
want to use, and then use that MP's specific parameters. To make it easier
on users, we can support a class="Tiered/LogMergePolicy" so users don't
have to define the full class name (maybe we prefix that with
"lucene.Tiered/LogMP"), but once it's defined, they need to use the right
parameters for the chosen MP.
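
As a sketch of what that shorthand might look like (this is hypothetical
syntax for the proposal, not something solrconfig.xml is known to accept; the
"lucene." prefix and the values are illustrative assumptions):

```xml
<!-- Hypothetical shorthand: a short name instead of the full
     org.apache.lucene.index.TieredMergePolicy class name -->
<mergePolicy class="lucene.TieredMergePolicy">
  <!-- parameters must still match the chosen MP -->
  <int name="maxMergeAtOnce">10</int>
  <int name="segmentsPerTier">10</int>
</mergePolicy>
```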

The existence of this parameter also forces us to document it in the
reference guide, but I think it's more important that we document the
default MergePolicy and some of its important settings (e.g.
maxMergedSegmentMB, maxMergeAtOnce and segmentsPerTier).


On Tue, Jul 29, 2014 at 10:13 PM, Yonik Seeley wrote:

> On Tue, Jul 29, 2014 at 11:35 AM, Chris Hostetter wrote:
> > the use of an explicit <mergeFactor> is just there for backcompat.
> Right, mergeFactor pre-dates lucene MergePolicies...
> I think some parameters like this are still useful too.  It makes
> sense to be able to set the mergeFactor or the maximum size that any
> segment can grow to without knowing (or caring) about exactly what
> merge policy is the default these days, or what it does.  Configuring
> a specific merge policy is definitely more expert level.
> -Yonik
