lucene-dev mailing list archives

From "David Smiley (JIRA)" <>
Subject [jira] [Commented] (LUCENE-8331) MergePolicy simulator utility
Date Thu, 24 May 2018 20:19:00 GMT


David Smiley commented on LUCENE-8331:

CC [~mikemccand] [~simonw] [~erickerickson]

I used this utility (with some other edits not in this patch) to evaluate a custom merge policy
that had a notion of "cheap" merges.  It turned out to be very successful; I may open other
issues about ways TieredMergePolicy and/or the MergeScheduler can be improved.

The main features of this simulator are:
* doesn't require actual indexing and is thus super-fast
* calculates useful stats like the average number of segments and the average write amplification
* provides a random sequence of flushed segment sizes that can be controlled in a couple of ways
to make it more or less realistic depending on your environment
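
To illustrate the idea (this is not the code in the patch), here is a minimal, self-contained sketch of how such a simulation can track segment sizes only and still compute the two stats above. Everything in it is an assumption made for illustration: the class name MergeSimSketch, the toy "merge the smallest segments once there are too many" policy (which is not TieredMergePolicy), and the uniform random flush sizes. Write amplification is taken here as total bytes written (flushes plus merges) divided by bytes flushed.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical toy simulator; not the utility attached to this issue. */
public class MergeSimSketch {

    /** Aggregate statistics from one simulated run. */
    public static final class Result {
        public final double avgSegmentCount;
        public final double writeAmplification;

        Result(double avgSegmentCount, double writeAmplification) {
            this.avgSegmentCount = avgSegmentCount;
            this.writeAmplification = writeAmplification;
        }
    }

    /** Pseudo-random flushed segment sizes in [minSize, maxSize], reproducible via the seed. */
    public static long[] randomFlushSizes(int count, long minSize, long maxSize, long seed) {
        Random random = new Random(seed);
        long[] sizes = new long[count];
        for (int i = 0; i < count; i++) {
            sizes[i] = minSize + (long) (random.nextDouble() * (maxSize - minSize + 1));
        }
        return sizes;
    }

    /**
     * Toy policy (an assumption, not a Lucene policy): after each flush, while more than
     * maxSegments segments exist, merge the mergeFactor smallest segments into one.
     * No real indexing happens; only sizes are tracked.
     */
    public static Result simulate(long[] flushSizes, int maxSegments, int mergeFactor) {
        if (flushSizes.length == 0) {
            throw new IllegalArgumentException("need at least one flush");
        }
        if (mergeFactor < 2 || mergeFactor > maxSegments + 1) {
            throw new IllegalArgumentException("require 2 <= mergeFactor <= maxSegments + 1");
        }
        List<Long> segments = new ArrayList<>();
        long bytesFlushed = 0;
        long bytesWritten = 0;
        long segmentCountSum = 0;
        for (long size : flushSizes) {
            // A flush writes the new segment once.
            segments.add(size);
            bytesFlushed += size;
            bytesWritten += size;
            while (segments.size() > maxSegments) {
                Collections.sort(segments);
                long merged = 0;
                for (int i = 0; i < mergeFactor; i++) {
                    merged += segments.remove(0); // remove(int index): take the smallest
                }
                segments.add(merged);
                bytesWritten += merged; // a merge rewrites every byte it touches
            }
            // Sample the segment count once per flush, after merging settles.
            segmentCountSum += segments.size();
        }
        return new Result(
                (double) segmentCountSum / flushSizes.length,
                (double) bytesWritten / bytesFlushed);
    }

    public static void main(String[] args) {
        long[] flushes = randomFlushSizes(10_000, 1_000_000L, 20_000_000L, 42L);
        Result r = simulate(flushes, 30, 10);
        System.out.printf("avg segments: %.2f, write amplification: %.2f%n",
                r.avgSegmentCount, r.writeAmplification);
    }
}
```

Because nothing is actually indexed, a run over thousands of simulated flushes finishes almost instantly, which makes it cheap to compare policies or parameter settings side by side.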

Some not-so-great parts:
* does not yet handle deletes!
* configuring the merge policy under test and varying the inputs is a manual affair: you
edit main() and/or makeMergePolicy().  I did add some System property overrides and some
basic args parsing, though.  It's probably not realistic to expect much better, given that
this is meant for experimentation.

What do you think, guys?

> MergePolicy simulator utility
> -----------------------------
>                 Key: LUCENE-8331
>                 URL:
>             Project: Lucene - Core
>          Issue Type: New Feature
>            Reporter: David Smiley
>            Assignee: David Smiley
>            Priority: Major
>         Attachments: LUCENE-8331.patch
> This issue introduces a MergePolicy simulator utility to help evaluate the effectiveness
> of a MergePolicy.  The simulator does not actually index or merge segments;
> instead it provides some dummy constructs to MergePolicy to evaluate its decisions.  Therefore
> you can do simulation runs in little time.
> I'm not sure where it would live.  Perhaps dev-tools, or in tests, or in benchmark?
> I mentioned this recently here:
