hbase-issues mailing list archives

From "Mikhail Antonov (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
Date Wed, 08 Apr 2015 07:11:13 GMT

    [ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14484867#comment-14484867 ]

Mikhail Antonov commented on HBASE-13103:

[~phobos182] thanks for the feedback! Very useful. I have a number of questions I'd like
to ask, if you don't mind, to better understand the real needs.

bq.  Given the time difference between when the commands were run, this could end up with
different region boundaries between the clusters, which is not desired. So I second the
idea of generating a "reshaping plan" so it can be applied in the same manner on the slave cluster.

 - How strictly consistent are your master and slave clusters? How much can they diverge? Is
the second cluster mostly for long-running analytics, which only dump output into some other table?
 - So you don't have automatic splits now, as I understand, only pre-split tables? Otherwise
how are you ensuring that the region boundaries are exactly the same? What's the avg region
 - Do you want region boundaries to be exactly the same, or approximately the same?

The current patch has a notion of a "reshaping plan", which includes params like the split point
(currently not computed though :) ). It would be feasible to send these plans to the normalizer on
the other side, or rather, to expose a normalize() call in the master rpc services that accepts a
serialized reshaping plan. The region names wouldn't be the same on both clusters anyway, though.
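
To make the idea concrete, here is a minimal sketch of a plan that could be serialized and shipped to another cluster. Everything in it is an assumption for illustration: the names ReshapingPlanSketch, Plan, Action, serialize and deserialize are hypothetical and are not taken from the patch, and the string wire format is just a placeholder. Because region names differ between clusters, the sketch addresses the target by row-key range instead of by region name.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical sketch only: none of these names come from the HBASE-13103 patch.
public class ReshapingPlanSketch {

    enum Action { SPLIT, MERGE }

    // A plan addresses its target by row-key range rather than by region name,
    // since region names are not the same on the two clusters.
    static final class Plan {
        final Action action;
        final byte[] startKey;   // start of the affected key range
        final byte[] endKey;     // end of the affected key range
        final byte[] splitPoint; // used for SPLIT only; empty for MERGE

        Plan(Action action, byte[] startKey, byte[] endKey, byte[] splitPoint) {
            this.action = action;
            this.startKey = startKey;
            this.endKey = endKey;
            this.splitPoint = splitPoint;
        }
    }

    // Encode the plan as "ACTION:start:end:split" with Base64-encoded keys.
    // The Base64 alphabet contains no ':' so the separator is unambiguous.
    static String serialize(Plan p) {
        Base64.Encoder enc = Base64.getEncoder();
        return p.action.name() + ":"
            + enc.encodeToString(p.startKey) + ":"
            + enc.encodeToString(p.endKey) + ":"
            + enc.encodeToString(p.splitPoint);
    }

    // Parse the string form back into a plan; rejects malformed input.
    static Plan deserialize(String s) {
        // limit -1 keeps trailing empty fields (an empty key encodes to "")
        String[] parts = s.split(":", -1);
        if (parts.length != 4) {
            throw new IllegalArgumentException("expected 4 fields, got " + parts.length);
        }
        Base64.Decoder dec = Base64.getDecoder();
        return new Plan(Action.valueOf(parts[0]),
            dec.decode(parts[1]), dec.decode(parts[2]), dec.decode(parts[3]));
    }

    public static void main(String[] args) {
        Plan plan = new Plan(Action.SPLIT,
            "row-a".getBytes(StandardCharsets.UTF_8),
            "row-z".getBytes(StandardCharsets.UTF_8),
            "row-m".getBytes(StandardCharsets.UTF_8));
        String wire = serialize(plan);
        Plan copy = deserialize(wire);
        System.out.println(wire);
        System.out.println(copy.action + " at "
            + new String(copy.splitPoint, StandardCharsets.UTF_8));
    }
}
```

A receiving side would still have to map the key range onto whatever regions currently cover it, which only works cleanly if the boundaries on both clusters already match.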

bq. Probably should think about performing a major compaction operation before the normalize
policy runs.
Yeah, that makes sense. Though I think most people run major compactions infrequently, so
making this a prerequisite would change that operational practice? How often do you run major

> [ergonomics] add region size balancing as a feature of master
> -------------------------------------------------------------
>                 Key: HBASE-13103
>                 URL: https://issues.apache.org/jira/browse/HBASE-13103
>             Project: HBase
>          Issue Type: Brainstorming
>          Components: Usability
>            Reporter: Nick Dimiduk
>            Assignee: Mikhail Antonov
>             Fix For: 2.0.0, 1.1.0
>         Attachments: HBASE-13103-v0.patch
> Often enough, folks misjudge split points or otherwise end up with a suboptimal number
of regions. We should have an automated, reliable way to "reshape" or "balance" a table's
region boundaries. This would be for tables that contain existing data. This might look like:
> {noformat}
> Admin#reshapeTable(TableName, int numSplits);
> {noformat}
> or from the shell:
> {noformat}
> > reshape TABLE, numSplits
> {noformat}
> Better still would be to have a maintenance process, similar to the existing Balancer
that runs AssignmentManager on an interval, to run the above "reshape" operation on an interval.
That way, the cluster will automatically self-correct toward a desirable state.

This message was sent by Atlassian JIRA
