hbase-issues mailing list archives

From "Lars Hofhansl (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-7763) Compactions not sorting based on size anymore.
Date Thu, 07 Feb 2013 01:19:13 GMT

    [ https://issues.apache.org/jira/browse/HBASE-7763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13573072#comment-13573072 ]

Lars Hofhansl commented on HBASE-7763:

So just to state the obvious: the selection depends on which metric we're trying to optimize.
We can optimize for (1) write amplification (i.e. minimizing it) or (2) read
performance (reducing the number of scanners participating in the merge scan).

For case #1 we'd pick larger files first, and for #2 we'd pick smaller files first.
Making this configurable or pluggable thus makes a lot of sense.
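
To make the "configurable or pluggable" idea concrete, here is a minimal sketch of a size ordering chosen by optimization goal. It follows the mapping stated above (larger files first for #1, smaller files first for #2) and is not HBase's actual selection code: the class, the enum, and the FileInfo type are all hypothetical names made up for illustration, and a real policy would also have to decide which of the ordered files to pick.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch; none of these names come from the HBase code base. */
final class CompactionOrderingSketch {

    /** The two optimization goals named in the comment above. */
    enum Goal { WRITE_AMPLIFICATION, READ_PERFORMANCE }

    /** Stand-in for a store file: only a name and a size in bytes. */
    static final class FileInfo {
        final String name;
        final long sizeBytes;

        FileInfo(String name, long sizeBytes) {
            this.name = name;
            this.sizeBytes = sizeBytes;
        }
    }

    /** Larger files first for goal #1, smaller files first for goal #2. */
    static Comparator<FileInfo> comparatorFor(Goal goal) {
        Comparator<FileInfo> smallestFirst =
                Comparator.comparingLong((FileInfo f) -> f.sizeBytes);
        return goal == Goal.WRITE_AMPLIFICATION
                ? smallestFirst.reversed()
                : smallestFirst;
    }

    /** Returns a sorted copy of the candidates; the input list is not modified. */
    static List<FileInfo> order(List<FileInfo> candidates, Goal goal) {
        List<FileInfo> sorted = new ArrayList<>(candidates);
        sorted.sort(comparatorFor(goal));
        return sorted;
    }

    public static void main(String[] args) {
        // Made-up sizes, e.g. one big bulk-loaded file among small flushes.
        List<FileInfo> files = Arrays.asList(
                new FileInfo("flush-1", 10L),
                new FileInfo("bulk-load", 5000L),
                new FileInfo("flush-2", 20L));
        for (Goal goal : Goal.values()) {
            StringBuilder line = new StringBuilder(goal + ":");
            for (FileInfo f : order(files, goal)) {
                line.append(' ').append(f.name);
            }
            System.out.println(line);
        }
    }
}
```

Putting the ordering behind a single comparator factory means the goal could be read from configuration, or the whole comparator swapped out, without touching the rest of the selection logic.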

The actual behavior in a production setting is very hard to predict, and as I said above:
The folks from Facebook did a lot of research and measuring to come up with the current compaction
selection algorithm. I'm surprised they are quiet on this.

> Compactions not sorting based on size anymore.
> ----------------------------------------------
>                 Key: HBASE-7763
>                 URL: https://issues.apache.org/jira/browse/HBASE-7763
>             Project: HBase
>          Issue Type: Bug
>          Components: Compaction
>    Affects Versions: 0.96.0, 0.94.4
>            Reporter: Elliott Clark
>            Assignee: Elliott Clark
>            Priority: Critical
>             Fix For: 0.96.0, 0.94.6
>         Attachments: HBASE-7763-trunk-TESTING.patch, HBASE-7763-trunk-TESTING.patch,
> Currently compaction selection is not sorting based on size.  This causes selection to
> choose larger files to re-write than are needed when bulk loads are involved.

