hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-14651) Default minimum compaction size is too high
Date Wed, 21 Oct 2015 05:23:27 GMT

    [ https://issues.apache.org/jira/browse/HBASE-14651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14966259#comment-14966259 ]

stack commented on HBASE-14651:

Nice [~vrodionov]. How'd you figure this? Did you see it in the wild? Yes on detaching it from the
flush size given your reasoning above... the two should not be tied.

Needs a fat release note.

[~eclark] Take a look here sir.

I always want to do up an Excel spreadsheet or a Python script (there used to be one IIRC
but I could not find it just now), or I suppose it could be JavaScript, which might be easiest,
that could run compaction generations using our actual compaction selection policy and our
configs. It would take inputs ranging from the pathological state to the optimal, and work out
the write amplification a particular set of configs/policy would result in.
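A minimal sketch of the kind of simulator described above, in Python. The selection rule here is a deliberate simplification, not HBase's actual ExploringCompactionPolicy: a file in a candidate window passes if it is no larger than the ratio times the sum of the window's other files, and files below the min.size threshold skip the ratio check entirely (the behavior this issue is about). All names (`select`, `simulate`) and the constants are illustrative.

```python
RATIO = 1.2      # stand-in for hbase.hstore.compaction.ratio
MIN_FILES = 3    # stand-in for hbase.hstore.compaction.min

def select(files, min_size):
    """Return a list of files to compact together, or None.

    Simplified ratio check: a file qualifies if it is <= RATIO times the
    sum of the other files in the window, OR if it is below min_size
    (which bypasses the ratio check, as described in this issue).
    """
    files = sorted(files)
    for start in range(len(files) - MIN_FILES + 1):
        window = files[start:start + MIN_FILES]
        total = sum(window)
        if all(f < min_size or f <= RATIO * (total - f) for f in window):
            return window
    return None

def simulate(flush_size, num_flushes, min_size):
    """Flush num_flushes files of flush_size bytes, compacting greedily
    after each flush; return the resulting write amplification
    (total bytes written to disk / bytes flushed)."""
    files, written = [], 0
    for _ in range(num_flushes):
        files.append(flush_size)
        written += flush_size            # the flush itself hits disk once
        sel = select(files, min_size)
        while sel is not None:
            for f in sel:                # merge the selection into one file
                files.remove(f)
            merged = sum(sel)
            files.append(merged)
            written += merged            # compaction rewrites every byte
            sel = select(files, min_size)
    return written / (flush_size * num_flushes)

# 8MB flushes: compare the proposed 64MB threshold with a 256MB one.
wa = simulate(flush_size=8 << 20, num_flushes=100, min_size=64 << 20)
print(f"write amplification at 64MB min.size: {wa:.1f}")
```

Even this toy model shows the effect the description below argues for: raising min.size keeps the large accumulated file "always eligible" longer, so it gets rewritten on nearly every selection and write amplification climbs.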

> Default minimum compaction size is too high
> -------------------------------------------
>                 Key: HBASE-14651
>                 URL: https://issues.apache.org/jira/browse/HBASE-14651
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Vladimir Rodionov
>            Assignee: Vladimir Rodionov
>         Attachments: HBASE-14651-v1.patch, HBASE-14651-v2.patch
> *hbase.hstore.compaction.min.size* defines the minimum selection size that is always eligible
> for minor compaction (no compaction ratio check is performed on such file selections). The
> default is equal to the memstore flush size (128MB). First of all, even this value is too high
> for some (many) deployments, especially write-intensive ones, because of the small size of
> memstore flushes; and if users increase the memstore flush size (they usually set it to at
> least 256MB), they have no idea how it will impact overall compaction efficiency. With a
> 256MB minimum size to compact, the compactor skips the necessary file ratio checks most of
> the time, and this results in increased read/write IO during compactions because of
> unbalanced selections where relatively large files can be mixed with newly created small
> store files. I think we should set this default minimum to 64MB and not link it to the
> memstore flush size at all.
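For reference, a deployment can already decouple the threshold today with an explicit override in hbase-site.xml; the property name is taken from the description above, and the value shown is the 64MB default proposed there, expressed in bytes:

```xml
<!-- Override the "always eligible for minor compaction" size threshold
     instead of inheriting it from the 128MB memstore flush size.
     67108864 bytes = the 64MB default proposed in this issue. -->
<property>
  <name>hbase.hstore.compaction.min.size</name>
  <value>67108864</value>
</property>
```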

This message was sent by Atlassian JIRA
