hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-14651) Default minimum compaction size is too high
Date Mon, 02 Nov 2015 22:13:27 GMT

    [ https://issues.apache.org/jira/browse/HBASE-14651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14986161#comment-14986161 ]

stack commented on HBASE-14651:

YCSB. So, row keys are numbers and the value is 1k.
Yes, random bytes.
How do you mean, stats for memstore flush sizes?

I'd think that if this patch was going to make a difference, the simple YCSB with regular
keys would be where it would shine best?

> Default minimum compaction size is too high
> -------------------------------------------
>                 Key: HBASE-14651
>                 URL: https://issues.apache.org/jira/browse/HBASE-14651
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Vladimir Rodionov
>            Assignee: Vladimir Rodionov
>         Attachments: HBASE-14651-v1.patch, HBASE-14651-v2.patch, bytes.png, files.png
> *hbase.hstore.compaction.min.size* defines the minimum selection size that is always eligible
for minor compaction (no compaction ratio check is performed on such file selections). The default
is equal to the memstore flush size (128MB). First of all, even this value is too high
for some (many) deployments, especially write-intensive ones, because their memstore flushes
are small. And if users increase the memstore flush size (they usually set it to at least
256MB), they have no idea how it will impact overall compaction efficiency. With
a 256MB minimum size to compact, the compactor skips the necessary file ratio checks most of the
time, which results in increased read/write IO during compactions because of unbalanced
selections, where relatively large files can be mixed with newly created small store files.
I think we should set this default minimum to 64MB and not link it to the memstore flush size
at all.
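
To make the effect described above concrete, here is a minimal sketch of a ratio check with a minimum-size bypass. It is a simplified illustration, not HBase's actual selection code: the class name MinSizeSketch, the method firstSelectedIndex, the file sizes, the ratio of 1.2 and the minimum of 3 files are all assumptions chosen for the example. Files are ordered oldest to newest, and a file is kept in the selection if it is no larger than the greater of the minimum size and ratio times the total size of the newer files.

```java
public class MinSizeSketch {
    /**
     * Returns the index of the first (oldest) file kept in the selection.
     * Hypothetical simplification: the real policy also bounds the number
     * of files per selection, which is ignored here.
     */
    static int firstSelectedIndex(long[] sizes, long minCompactSize,
                                  double ratio, int minFiles) {
        int start = 0;
        while (sizes.length - start >= minFiles) {
            long sumNewer = 0;
            for (int i = start + 1; i < sizes.length; i++) {
                sumNewer += sizes[i];
            }
            // A file at or below minCompactSize passes regardless of the ratio.
            long threshold = Math.max(minCompactSize, (long) (sumNewer * ratio));
            if (sizes[start] <= threshold) {
                break; // this file and all newer ones form the selection
            }
            start++; // file is too large relative to the newer files; skip it
        }
        return start;
    }

    public static void main(String[] args) {
        final long MB = 1024L * 1024L;
        // Assumed store files, oldest first: 1GB, 200MB, 30MB, 20MB.
        long[] sizes = {1024 * MB, 200 * MB, 30 * MB, 20 * MB};
        System.out.println(firstSelectedIndex(sizes, 256 * MB, 1.2, 3));
        System.out.println(firstSelectedIndex(sizes, 64 * MB, 1.2, 3));
    }
}
```

Under these assumed inputs, a 256MB minimum keeps the 200MB file in the selection alongside the 30MB and 20MB files, because it falls under the minimum and never faces the ratio test. With a 64MB minimum the same file fails the ratio test (200MB against 1.2 x 50MB) and is skipped, leaving fewer than three candidates.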

