lucene-solr-user mailing list archives

From Erick Erickson <>
Subject Re: Solr Merge during off peak times
Date Wed, 02 May 2012 12:15:13 GMT
But again, with a master/slave setup merging should
be relatively benign. And at 200M docs, having a M/S
setup is probably indicated.

Here's a good writeup of mergepolicy

If you're indexing and searching on a single machine, merging
is much less important than how often you commit. In an M/S
situation, your polling interval on the slave is what matters.

I'd look at commit frequency long before I worried about merging,
that's usually where people shoot themselves in the foot - by
committing too often.

Overall, your mergeFactor is probably less important than other
parts of how you perform indexing/searching, but it does have
some effect for sure...
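
To make the commit-frequency point concrete, here is a rough sketch of the arithmetic using the figures from the quoted message below (300k docs per day, peaks of about 20 docs per second). The class name and the commit intervals are assumptions picked for illustration, and nothing here calls Solr or Lucene. The premise is that a commit flushes whatever documents are buffered into a new segment, so the commit interval caps how many documents each new segment can hold. Segments may end up smaller still if the RAM buffer fills first.

```java
import java.util.Locale;

/** Back-of-the-envelope arithmetic only; nothing here talks to Solr or Lucene. */
final class CommitSizing {
    static final long SECONDS_PER_DAY = 24L * 60 * 60;

    /** Average ingestion rate in docs/sec for a given daily volume. */
    static double averageDocsPerSecond(long docsPerDay) {
        return (double) docsPerDay / SECONDS_PER_DAY;
    }

    /** Documents that accumulate between two commits at a steady rate. */
    static double docsPerCommit(double docsPerSecond, long commitIntervalSeconds) {
        return docsPerSecond * commitIntervalSeconds;
    }

    /** Number of commits issued per day at a fixed commit interval. */
    static long commitsPerDay(long commitIntervalSeconds) {
        return SECONDS_PER_DAY / commitIntervalSeconds;
    }

    public static void main(String[] args) {
        long docsPerDay = 300_000;        // from the thread
        double peakRate = 20.0;           // docs/sec, from the thread
        long[] intervals = {1, 60, 900};  // hypothetical commit intervals, in seconds

        System.out.printf(Locale.ROOT, "average rate: %.2f docs/sec%n",
                averageDocsPerSecond(docsPerDay));
        for (long s : intervals) {
            System.out.printf(Locale.ROOT,
                    "commit every %d s: %d commits/day, up to %.0f docs per commit at peak%n",
                    s, commitsPerDay(s), docsPerCommit(peakRate, s));
        }
    }
}
```

Under these assumptions, committing every second produces tens of thousands of tiny segments a day for the merge policy to clean up, while a commit every few minutes produces far fewer and larger ones. That is why commit frequency tends to dominate mergeFactor as a tuning knob.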


On Wed, May 2, 2012 at 7:54 AM, Prakashganesh, Prabhu
<> wrote:
> We have a fairly large scale system - about 200 million docs and fairly
> high indexing activity - about 300k docs per day with peak ingestion
> rates of about 20 docs per sec. I want to work out what a good
> mergeFactor setting would be by testing with different mergeFactor
> settings. I think the default of 10 might be high, I want to try with 5
> and compare. Unless I know when a merge starts and finishes, it would be
> quite difficult to work out the impact of changing mergeFactor. I want
> to be able to measure how long merges take, run queries during the merge
> activity and see what the response times are, etc.
> Thanks
> Prabhu
> -----Original Message-----
> From: Erick Erickson []
> Sent: 02 May 2012 12:40
> To:
> Subject: Re: Solr Merge during off peak times
> Why do you care? Merging is generally a background process, or are
> you doing heavy indexing? In a master/slave setup,
> it's usually not really relevant except that (with 3.x), massive merges
> may temporarily stop indexing. Is that the problem?
> Look at the merge policies, there are configurations that make
> this less painful.
> In trunk, DocumentsWriterPerThread makes merges happen in the
> background, which helps the long-pause-while-indexing problem.
> Best
> Erick
> On Wed, May 2, 2012 at 7:22 AM, Prakashganesh, Prabhu
> <> wrote:
>> Ok, thanks Otis
>> Another question on merging
>> What is the best way to monitor merging?
>> Is there something in the log file that I can look for?
>> It seems like I have to monitor the system resources - read/write
>> IOPS etc. - and work out when a merge happened
>> It would be great if I can do it by looking at log files or in the
>> admin UI. Do you know if this can be done or if there is some tool
>> for this?
>> Thanks
>> Prabhu
>> -----Original Message-----
>> From: Otis Gospodnetic []
>> Sent: 01 May 2012 15:12
>> To:
>> Subject: Re: Solr Merge during off peak times
>> Hi Prabhu,
>> I don't think such a merge policy exists, but it would be nice to have
>> this option and I imagine it wouldn't be hard to write if you really
>> just base the merge or no merge decision on the time of day (and maybe
>> day of the week).
>> Note that this should go into Lucene, not Solr, so if you decide to
>> contribute your work, please see
>> Otis
>> ----
>> Performance Monitoring for Solr -
>>> From: "Prakashganesh, Prabhu" <>
>>> To: "" <>
>>> Sent: Tuesday, May 1, 2012 8:45 AM
>>> Subject: Solr Merge during off peak times
>>> I would like to know if there is a way to configure index merge policy
>>> in solr so that the merging happens during off peak hours. Can you
>>> please let me know if such a merge policy configuration exists?
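
As a rough illustration of the time-of-day idea Otis describes in this thread, the decision itself can be as small as a window check. This is only a sketch under stated assumptions: OffPeakWindow is a hypothetical helper, the 22:00 to 05:00 window is made up, and the code deliberately does not touch Lucene's MergePolicy API, so how such a check would be wired into a real merge policy is left open.

```java
import java.time.LocalTime;

/** Hypothetical helper: is a given time of day inside a configured off-peak window? */
final class OffPeakWindow {
    private final LocalTime start; // inclusive
    private final LocalTime end;   // exclusive

    OffPeakWindow(LocalTime start, LocalTime end) {
        this.start = start;
        this.end = end;
    }

    /** True if t falls in [start, end); handles windows that wrap past midnight. */
    boolean contains(LocalTime t) {
        if (start.equals(end)) {
            return false; // treat an empty window as "never off-peak"
        }
        if (start.isBefore(end)) {
            // Same-day window, e.g. 01:00 to 04:00.
            return !t.isBefore(start) && t.isBefore(end);
        }
        // Window wraps midnight, e.g. 22:00 to 05:00.
        return !t.isBefore(start) || t.isBefore(end);
    }

    public static void main(String[] args) {
        // Assumed window; pick one that matches your own traffic pattern.
        OffPeakWindow window = new OffPeakWindow(LocalTime.of(22, 0), LocalTime.of(5, 0));
        boolean offPeak = window.contains(LocalTime.now());
        System.out.println(offPeak ? "off-peak: merges allowed" : "peak: defer merges");
    }
}
```

A day-of-week condition could be added the same way. Keep in mind that deferring merges lets segments pile up during the day, which can slow searches, so any such policy would need to be measured against the default behaviour.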
