hbase-issues mailing list archives

From "Sergey Shelukhin (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-8765) split should be based on store size, not HFile size
Date Wed, 19 Jun 2013 00:14:20 GMT

     [ https://issues.apache.org/jira/browse/HBASE-8765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HBASE-8765:

    Fix Version/s: 0.95.2
> split should be based on store size, not HFile size
> ---------------------------------------------------
>                 Key: HBASE-8765
>                 URL: https://issues.apache.org/jira/browse/HBASE-8765
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.95.1
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>             Fix For: 0.95.2
> I noticed that the current split behavior is suboptimal with regard to compactions. On large regions, the HFile size limit triggers a split. The split is followed by a major compaction to get rid of the partial reference files. However, most of the time the HFile size limit is surpassed only as a result of a compaction.
> So, first we rewrite a lot of data into a new file. Then we say "Oh look! A large file!", split the region, and rewrite everything again.
> Perhaps region split should be based on store size, or on incoming compaction size: a large enough compaction should be converted into a split.
> Thoughts? I think basing it on store size is a simple fix, and I will code it up soon if there are no objections.
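
The difference between the two split triggers described above can be illustrated with a minimal sketch. This is not HBase's actual split policy code: the class name, method names, and the 10 GiB limit are all hypothetical, chosen only to show how a check on the largest single HFile fires after compaction, while a check on the total store size would fire before it.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration only; not taken from the HBase code base. */
final class SplitCheckSketch {

    /** Current behavior as described: split when one HFile exceeds the limit. */
    static boolean shouldSplitByLargestFile(List<Long> hfileSizes, long maxSize) {
        long largest = 0;
        for (long size : hfileSizes) {
            largest = Math.max(largest, size);
        }
        return largest > maxSize;
    }

    /** Proposed behavior: split when the sum of all HFiles in the store exceeds the limit. */
    static boolean shouldSplitByStoreSize(List<Long> hfileSizes, long maxSize) {
        long total = 0;
        for (long size : hfileSizes) {
            total += size;
        }
        return total > maxSize;
    }

    public static void main(String[] args) {
        long maxSize = 10L << 30; // assumed 10 GiB limit, for illustration

        // Three 4 GiB files: no single file is over the limit, but the store is.
        List<Long> beforeCompaction = Arrays.asList(4L << 30, 4L << 30, 4L << 30);
        // After compaction the same data sits in one 12 GiB file.
        List<Long> afterCompaction = Arrays.asList(12L << 30);

        System.out.println("largest-file check, before compaction: "
                + shouldSplitByLargestFile(beforeCompaction, maxSize));
        System.out.println("store-size check, before compaction: "
                + shouldSplitByStoreSize(beforeCompaction, maxSize));
        System.out.println("largest-file check, after compaction: "
                + shouldSplitByLargestFile(afterCompaction, maxSize));
    }
}
```

In this sketch the store-size check would request the split while the data is still in three files, so the 12 GiB rewrite that precedes the split under the largest-file check is avoided.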

