hbase-issues mailing list archives

From "Nick Dimiduk (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
Date Fri, 19 Jun 2015 16:33:01 GMT

    [ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593598#comment-14593598 ]

Nick Dimiduk commented on HBASE-13103:

Max of 250 total regions on a region server, not per table. This is a rough guideline, and
will vary based on individual cluster configuration. Yes, this is definitely related to the
1M regions ticket.

bq. 1) should check that total number of regions doesn't approach the limits of AM

Yeah, there should be some upper bound on the total number of regions, which I assume would
be something like {{$MAX_REGIONS_PER_SERVER * $NUM_SERVERS}}, where max regions per server
is configurable.
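
A minimal sketch of that bound, in Java. The class and method names here (RegionBudget, maxTotalRegions, wouldExceed) are hypothetical and not part of HBase; the only thing taken from the discussion is the product of max regions per server and number of servers.

```java
/** Hypothetical helper, not an HBase API: a cluster-wide ceiling on region count. */
final class RegionBudget {
    private RegionBudget() {}

    /** Upper bound on total regions: maxRegionsPerServer * numServers. */
    static long maxTotalRegions(int maxRegionsPerServer, int numServers) {
        if (maxRegionsPerServer < 0 || numServers < 0) {
            throw new IllegalArgumentException("arguments must be non-negative");
        }
        // Widen to long before multiplying so large clusters cannot overflow int.
        return (long) maxRegionsPerServer * numServers;
    }

    /**
     * True if performing the given number of splits would push the cluster past
     * the ceiling. Each split replaces one region with two, a net gain of one.
     */
    static boolean wouldExceed(long currentRegions, long plannedSplits,
                               int maxRegionsPerServer, int numServers) {
        return currentRegions + plannedSplits
                > maxTotalRegions(maxRegionsPerServer, numServers);
    }
}
```

With the rough guideline of 250 regions per server and an assumed 10 servers, the ceiling comes out to 2500 regions, so a normalizer would stop proposing splits once the cluster reaches that count.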

bq. 2) we don't break table into ridiculously small regions (less than N hdfs blocks?)

Generally yes, but there is the counterexample I mentioned above, where I'm new to HBase
and my "big table" is only a single region on a single host. We want the beginners to have
a good experience too. More, smaller regions spread over an overpowered cluster should result
in everything being cached and a better intro experience.

bq. do you think what's discussed here about ideal size should go there, or in subsequent

I'm fine with improvements on the normalizer algorithms going in with subsequent patches.
I think your harness here is enough to let people get started -- for instance, Nasron from
the user list thread titled "Stochastic Balancer by tables".

> [ergonomics] add region size balancing as a feature of master
> -------------------------------------------------------------
>                 Key: HBASE-13103
>                 URL: https://issues.apache.org/jira/browse/HBASE-13103
>             Project: HBase
>          Issue Type: Improvement
>          Components: Balancer, Usability
>            Reporter: Nick Dimiduk
>            Assignee: Mikhail Antonov
>             Fix For: 2.0.0, 1.2.0
>         Attachments: HBASE-13103-v0.patch, HBASE-13103-v1.patch
> Often enough, folks misjudge split points or otherwise end up with a suboptimal number
of regions. We should have an automated, reliable way to "reshape" or "balance" a table's
region boundaries. This would be for tables that contain existing data. This might look like:
> {noformat}
> Admin#reshapeTable(TableName, int numSplits);
> {noformat}
> or from the shell:
> {noformat}
> > reshape TABLE, numSplits
> {noformat}
> Better still would be to have a maintenance process, similar to the existing Balancer
that runs AssignmentManager on an interval, to run the above "reshape" operation on an interval.
That way, the cluster will automatically self-correct toward a desirable state.
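
One pass of such a periodic "reshape" could be sketched roughly as follows. This is an illustration under assumed rules, not the algorithm in the attached patches: the ReshapeSketch class, the plan method, and the thresholds (split a region larger than twice the table's average size, merge two adjacent regions whose combined size is below the average) are all hypothetical. Region sizes are given in MB, in key order.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch, not HBase code: one pass of a size-based reshape planner. */
final class ReshapeSketch {
    private ReshapeSketch() {}

    /**
     * Returns actions such as "SPLIT 3" or "MERGE 0,1", where the numbers are
     * indexes into regionSizesMb (regions listed in key order).
     */
    static List<String> plan(long[] regionSizesMb) {
        List<String> actions = new ArrayList<>();
        if (regionSizesMb.length == 0) {
            return actions;
        }
        long total = 0;
        for (long size : regionSizesMb) {
            total += size;
        }
        double avg = (double) total / regionSizesMb.length;

        int i = 0;
        while (i < regionSizesMb.length) {
            long size = regionSizesMb[i];
            if (size > 2 * avg) {
                // Assumed rule: a region more than twice the average gets split.
                actions.add("SPLIT " + i);
                i++;
            } else if (i + 1 < regionSizesMb.length
                    && size + regionSizesMb[i + 1] < avg) {
                // Assumed rule: two adjacent regions that together are still
                // smaller than the average get merged; skip past both.
                actions.add("MERGE " + i + "," + (i + 1));
                i += 2;
            } else {
                i++;
            }
        }
        return actions;
    }
}
```

Running a pass like this on an interval, and applying only a few actions each time, would nudge the table toward evenly sized regions without a single disruptive reshape.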

This message was sent by Atlassian JIRA