lucene-java-user mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: setSegmentsPerTier >= setMaxMergeAtOnce ?
Date Sun, 09 Jun 2013 12:31:49 GMT
Hi Boaz,

That's correct!

But what counts as "too big" a merge is an app-level decision: it requires
testing in the "real" context and depends on things like how much free
RAM the OS can dedicate to read-ahead, whether you have an SSD, whether
you throttle the merge rate (RateLimitedDirectoryWrapper), etc.


Mike McCandless

http://blog.mikemccandless.com


On Sun, Jun 9, 2013 at 7:08 AM, Boaz Leskes <b.leskes@gmail.com> wrote:
> Hi Mike,
>
> Thanks for the quick answer. So if I understand correctly, collapsing tiers
> in one go leads to too many big merges. The goal is then to avoid overly
> large merges, which will happen if we allow complete tiers to be collapsed
> in one merge. We would rather have a tier collapsed partially (and thus
> more frequently). Am I correct?
>
> Cheers,
> Boaz
>
>
>
> On Sun, Jun 9, 2013 at 12:10 PM, Michael McCandless <
> lucene@mikemccandless.com> wrote:
>
>> The two settings let you decouple your tolerance for how many segments
>> are allowed to accumulate (setSegmentsPerTier), from how large a
>> single merge can be (setMaxMergeAtOnce).
>>
>> E.g. say setSegmentsPerTier is 20 and setMaxMergeAtOnce is 10.
>>
>> The 20 gives TMP a "generous" budget to allow up to 20 segments per
>> "log level" to accumulate, but at that point it will pick 10 of them
>> and merge them down at once.  At that point there are still 10
>> segments at that log level, which is fine, until another 10 segments
>> are created at that level and another merge is selected.
>>
>> Mike McCandless
>>
>> http://blog.mikemccandless.com
>>
>>
>> On Sun, Jun 9, 2013 at 4:38 AM, Boaz Leskes <b.leskes@gmail.com> wrote:
>> > Hi All,
>> >
>> > I recently looked at the settings for the TieredMergePolicy [1] and was
>> > puzzled by the note on the setSegmentsPerTier method indicating it should
>> > be equal to or larger than the MaxMergeAtOnce setting, in order to not
>> > cause too many merges.
>> >
>> > I understood segments per tier to indicate the target number of segments
>> > for every segment-size tier. If a tier has more segments than that number,
>> > all these segments are likely to be merged into a single one, which will
>> > then be part of the next tier. From that point of view, it's efficient to
>> > be able to collapse the tier in one merge operation. However, if
>> > MaxMergeAtOnce is smaller than the tier size, it will not be able to do
>> > it in one merge but will take several, or will not produce a segment that
>> > is close to the ideal size of the bigger tier.
>> >
>> > Obviously that line of thought conflicts with the note in
>> > setSegmentsPerTier's JavaDocs. Do I understand the setting/merge behavior
>> > correctly?
>> >
>> > Cheers,
>> > Boaz
>> >
>> >
>> >
>> >
>> >
>> > [1] http://lucene.apache.org/core/4_3_0/core/org/apache/lucene/index/TieredMergePolicy.html
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
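
The setSegmentsPerTier = 20 / setMaxMergeAtOnce = 10 example from the quoted
thread can be sketched as a toy simulation. This is only an illustration of
the description above, not TieredMergePolicy's actual selection algorithm.
The class TierSketch, its simulate method and the Result type are
hypothetical names. The sketch assumes a single "log level", one new segment
per flush, a merge as soon as the count reaches the budget, and that the
merged segment moves to the next level.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model of one "log level"; it does not use Lucene.
final class TierSketch {

    /** Outcome of the simulation for a single level. */
    public static final class Result {
        public final int segmentsAtLevel;   // segments left at the level
        public final int merges;            // merges that were triggered
        public final List<Integer> history; // count at the level after each flush

        Result(int segmentsAtLevel, int merges, List<Integer> history) {
            this.segmentsAtLevel = segmentsAtLevel;
            this.merges = merges;
            this.history = history;
        }
    }

    /**
     * Assumption: a merge is selected once the level holds segmentsPerTier
     * segments, and it removes at most maxMergeAtOnce of them from the level.
     */
    public static Result simulate(int segmentsPerTier, int maxMergeAtOnce, int flushes) {
        int count = 0;
        int merges = 0;
        List<Integer> history = new ArrayList<>();
        for (int i = 0; i < flushes; i++) {
            count++; // one new segment arrives at this level
            if (count >= segmentsPerTier) {
                // Merge only part of the tier; the rest stays where it is.
                count -= Math.min(maxMergeAtOnce, count);
                merges++;
            }
            history.add(count);
        }
        return new Result(count, merges, history);
    }

    public static void main(String[] args) {
        Result r = simulate(20, 10, 40);
        System.out.println("segments left: " + r.segmentsAtLevel + ", merges: " + r.merges);
    }
}
```

With these assumptions, 40 flushes trigger three merges of 10 segments each,
and the level drops back to 10 segments after every merge instead of being
collapsed in a single 20-segment merge.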
