hbase-issues mailing list archives

From "Nick Dimiduk (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
Date Wed, 08 Apr 2015 17:23:12 GMT

    [ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14485581#comment-14485581 ]

Nick Dimiduk commented on HBASE-13103:


bq. probably just like with balancer, there should be admin rpc call to turn balancer on/off?

Yes, that would be good. Exposing it through the shell would be desirable too, along with a
way to query its current status.

bq. Need to have "ideal" region size?

That's a good point. The "ideal size" is probably some percentage (70%?) of the max region size,
with a tolerance around it (i.e., this normalizer's target region size is 70 +/- 5% of {{hbase.hregion.max.filesize}}).
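
To make the arithmetic concrete, here is a rough sketch of that target window. It is plain Java, not HBase code: the class name, the {{Verdict}} enum, and the 70/5 constants are assumptions taken from the numbers floated above, and the max file size is passed in rather than read from configuration.

```java
public class RegionSizeTarget {
    // Hypothetical knobs, mirroring the "70 +/- 5%" suggestion in this comment.
    static final long TARGET_PERCENT = 70;
    static final long TOLERANCE_PERCENT = 5;

    enum Verdict { TOO_SMALL, IN_RANGE, TOO_LARGE }

    // Lower edge of the window: 65% of the max region size.
    // Integer math keeps the bounds exact; it would overflow only for
    // sizes far beyond any realistic hbase.hregion.max.filesize.
    static long lowerBound(long maxFileSize) {
        return maxFileSize * (TARGET_PERCENT - TOLERANCE_PERCENT) / 100;
    }

    // Upper edge of the window: 75% of the max region size.
    static long upperBound(long maxFileSize) {
        return maxFileSize * (TARGET_PERCENT + TOLERANCE_PERCENT) / 100;
    }

    // Classify one region's size (in bytes) against the window.
    static Verdict classify(long regionSize, long maxFileSize) {
        if (regionSize < lowerBound(maxFileSize)) {
            return Verdict.TOO_SMALL;
        }
        if (regionSize > upperBound(maxFileSize)) {
            return Verdict.TOO_LARGE;
        }
        return Verdict.IN_RANGE;
    }
}
```

With a max file size of 10737418240 bytes (10 GiB), for example, the window runs from 6979321856 to 8053063680 bytes.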

Thanks for coming around [~phobos182]!

bq. Since this operation is pretty impactful on performance...

I see this as not a single operation you run to normalize a table all at once, but rather
something that happens in the background all the time, a kind of "active anti-entropy" happening
behind the scenes to nudge a table into an ideal state. You think even a single split/merge
operation is too heavy-weight to be done without premeditation?
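
As a strawman for what "one nudge at a time" could mean, the sketch below plans at most a single split or merge per pass. Everything in it is hypothetical: the class and method names, the string-encoded plan, and the selection policy (split the largest oversized region first, otherwise merge the smallest adjacent undersized pair) are illustrative assumptions, not a proposal for the actual implementation.

```java
import java.util.List;

public class NormalizerStep {
    // Plan at most ONE action for this pass, given region sizes in key order
    // and a [lower, upper] target window in the same units.
    // Returns "SPLIT i", "MERGE i,j" (adjacent indexes), or "NONE".
    static String planOne(List<Long> sizes, long lower, long upper) {
        // First choice: split the largest region that exceeds the window.
        int largest = -1;
        long largestSize = upper;
        for (int i = 0; i < sizes.size(); i++) {
            long size = sizes.get(i);
            if (size > largestSize) {
                largest = i;
                largestSize = size;
            }
        }
        if (largest >= 0) {
            return "SPLIT " + largest;
        }

        // Otherwise: merge the adjacent pair of undersized regions with the
        // smallest combined size, as long as the result stays within the window.
        int best = -1;
        long bestSum = Long.MAX_VALUE;
        for (int i = 0; i + 1 < sizes.size(); i++) {
            long a = sizes.get(i);
            long b = sizes.get(i + 1);
            long sum = a + b;
            if (a < lower && b < lower && sum <= upper && sum < bestSum) {
                best = i;
                bestSum = sum;
            }
        }
        if (best >= 0) {
            return "MERGE " + best + "," + (best + 1);
        }
        return "NONE";
    }
}
```

Because each pass emits a single action, a periodic chore calling something like this would move the table toward the target window gradually instead of reshaping it all at once.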

> [ergonomics] add region size balancing as a feature of master
> -------------------------------------------------------------
>                 Key: HBASE-13103
>                 URL: https://issues.apache.org/jira/browse/HBASE-13103
>             Project: HBase
>          Issue Type: Brainstorming
>          Components: Usability
>            Reporter: Nick Dimiduk
>            Assignee: Mikhail Antonov
>             Fix For: 2.0.0, 1.1.0
>         Attachments: HBASE-13103-v0.patch
> Often enough, folks misjudge split points or otherwise end up with a suboptimal number
of regions. We should have an automated, reliable way to "reshape" or "balance" a table's
region boundaries. This would be for tables that contain existing data. This might look like:
> {noformat}
> Admin#reshapeTable(TableName, int numSplits);
> {noformat}
> or from the shell:
> {noformat}
> > reshape TABLE, numSplits
> {noformat}
> Better still would be to have a maintenance process, similar to the existing Balancer
that runs AssignmentManager on an interval, to run the above "reshape" operation on an interval.
That way, the cluster will automatically self-correct toward a desirable state.

This message was sent by Atlassian JIRA
