hbase-issues mailing list archives

From "Anoop Sam John (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-17110) Add an "Overall Strategy" option(balanced both on table level and server level) to SimpleLoadBalancer
Date Wed, 16 Nov 2016 10:44:59 GMT

    [ https://issues.apache.org/jira/browse/HBASE-17110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15670116#comment-15670116 ]

Anoop Sam John commented on HBASE-17110:

Interesting. Why do you think this cannot be an improvement to the byTable strategy rather
than a new strategy?
I see it not as a new strategy but as a config-based extension of the byTable strategy. But
my doubt is: why is such a config needed? Why can't the SimpleLB byTable strategy always do
this? Give first preference to balancing by table, and then make sure overall balance is also
achieved. I mean, at balance time, even if the balancer sees that the tables are perfectly
balanced, it should not stop there but try to find a better cluster-level balance. Rather than
requiring a new config, we can always do it. I can see the new config is OFF by default. Maybe
you don't want a behavior change on existing patch branches? At least for master, I feel the
config should be ON by default. And IMO the config itself is not required. Why should we not
allow this better balancing to work?
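
The two-phase idea above (keep every table balanced, then keep looking for a better cluster-level balance) can be sketched in a few lines. This is only an illustration, not the SimpleLoadBalancer code or the attached patch: the function names `table_counts` and `rebalance_overall`, the dictionary layout, and the greedy move rule are all assumptions made for the example. The input is the 3-table, 4-server case from the issue description quoted below.

```python
from collections import Counter

def table_counts(regions):
    """Count how many regions of each table a server holds."""
    return Counter(table for table, _ in regions)

def rebalance_overall(assignment):
    """Greedy sketch (hypothetical, not HBase code).

    Repeatedly move one region from a more loaded server to a server that
    holds at least two fewer regions, but only if the source holds more
    regions of that table than the destination. Such a move never widens
    the per-table spread, and each move lowers the sum of squared loads,
    so the loop terminates.
    """
    plan = {server: list(regions) for server, regions in assignment.items()}
    moved = True
    while moved:
        moved = False
        servers = sorted(plan, key=lambda s: len(plan[s]))  # least loaded first
        for dst in servers:
            for src in reversed(servers):  # most loaded first
                if len(plan[src]) - len(plan[dst]) < 2:
                    continue
                src_counts = table_counts(plan[src])
                dst_counts = table_counts(plan[dst])
                for region in plan[src]:
                    table = region[0]
                    if src_counts[table] > dst_counts[table]:
                        plan[src].remove(region)
                        plan[dst].append(region)
                        moved = True
                        break
                if moved:
                    break
            if moved:
                break
    return plan

# Starting layout from the issue description: balanced per table, server D empty.
start = {
    "A": [("table1", 1), ("table2", 1), ("table3", 1)],
    "B": [("table1", 2), ("table2", 2), ("table3", 2)],
    "C": [("table1", 3), ("table2", 3), ("table3", 3)],
    "D": [],
}
result = rebalance_overall(start)
loads = sorted(len(regions) for regions in result.values())
```

On this input the sketch ends with server loads of 2, 2, 2 and 3 while each table still has at most one region per server. Which server keeps three regions depends on iteration order, so it need not match the layout shown in the issue description.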

> Add an "Overall Strategy" option(balanced both on table level and server level) to SimpleLoadBalancer
> -----------------------------------------------------------------------------------------------------
>                 Key: HBASE-17110
>                 URL: https://issues.apache.org/jira/browse/HBASE-17110
>             Project: HBase
>          Issue Type: New Feature
>          Components: Balancer
>    Affects Versions: 2.0.0, 1.2.4
>            Reporter: Charlie Qiangeng Xu
>            Assignee: Charlie Qiangeng Xu
>         Attachments: HBASE-17110.patch, SimpleBalancerBytableOverall.V1
> This jira is about an enhancement to SimpleLoadBalancer. Here we introduce a new strategy, "bytableOverall", which can be enabled by adding:
> {noformat}
> <property>
>   <name>hbase.master.loadbalance.bytableOverall</name>
>   <value>true</value>
> </property>
> {noformat}
> We have been using the strategy on our largest cluster for several months. It has proven to be very helpful and stable, and the result is quite visible to the users.
> Here is the reason why it's helpful:
> When operating large-scale clusters (our case), some companies still prefer to use {{SimpleLoadBalancer}} for its simplicity, quick balance plan generation, etc. The current SimpleLoadBalancer has two modes:
> 1. byTable, which only guarantees that the regions of each table are uniformly distributed.
> 2. byCluster, which ignores the distribution within tables and balances the regions all
> If the pressures on different tables are different, the first option, byTable, is preferable in most cases. Yet this choice sacrifices cluster-level balance and can leave some servers with significantly higher load, e.g. 242 regions on server A but 417 regions on server B (real-world stats).
> Consider this case: a cluster with 3 tables and 4 servers.
> {noformat}
>   server A has 3 regions: table1:1, table2:1, table3:1
>   server B has 3 regions: table1:2, table2:2, table3:2
>   server C has 3 regions: table1:3, table2:3, table3:3
>   server D has 0 regions.
> {noformat}
> From the byTable strategy's perspective, the cluster is already perfectly balanced at the table level. But an ideal state would look like:
> {noformat}
>   server A has 2 regions: table2:1, table3:1
>   server B has 2 regions: table1:2, table3:2
>   server C has 3 regions: table1:3, table2:3, table3:3
>   server D has 2 regions: table1:1, table2:2
> {noformat}
> We can see the server loads change from 3,3,3,0 to 2,2,3,2, while table1, table2 and table3 each remain balanced.
> And this is what the new mode "byTableOverall" can achieve.
> Two UTs have been added as well and the last one demonstrates the advantage of the new
> Also, an onConfigurationChange method has been implemented so that the "slop" variable can be changed at runtime.
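
The arithmetic in the 3-table, 4-server example from the description can be checked with a short script. This is a minimal sketch for illustration only: the helper names `server_loads` and `table_spread` and the dictionary layout are assumptions, not part of HBase. "Spread" here means the difference between the most and fewest regions of one table on any server, so a spread of at most 1 is taken to mean the table is balanced.

```python
def server_loads(assignment):
    """Number of regions hosted by each server."""
    return {server: len(regions) for server, regions in assignment.items()}

def table_spread(assignment):
    """For each table, max minus min region count across all servers."""
    tables = {table for regions in assignment.values() for table, _ in regions}
    spread = {}
    for table in tables:
        per_server = [
            sum(1 for t, _ in regions if t == table)
            for regions in assignment.values()
        ]
        spread[table] = max(per_server) - min(per_server)
    return spread

# Layout that byTable already considers balanced (server D is empty).
before = {
    "A": [("table1", 1), ("table2", 1), ("table3", 1)],
    "B": [("table1", 2), ("table2", 2), ("table3", 2)],
    "C": [("table1", 3), ("table2", 3), ("table3", 3)],
    "D": [],
}

# Layout given in the description as the better overall state.
after = {
    "A": [("table2", 1), ("table3", 1)],
    "B": [("table1", 2), ("table3", 2)],
    "C": [("table1", 3), ("table2", 3), ("table3", 3)],
    "D": [("table1", 1), ("table2", 2)],
}

for name, layout in (("before", before), ("after", after)):
    print(name, server_loads(layout), table_spread(layout))
```

Both layouts have a spread of 1 for every table, so the table-level balance is the same; only the server loads differ (3,3,3,0 versus 2,2,3,2).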

This message was sent by Atlassian JIRA