hbase-issues mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-17039) SimpleLoadBalancer schedules large amount of invalid region moves
Date Wed, 09 Nov 2016 13:36:58 GMT

    [ https://issues.apache.org/jira/browse/HBASE-17039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15650944#comment-15650944 ]

Hudson commented on HBASE-17039:
--------------------------------

SUCCESS: Integrated in Jenkins build HBase-1.2-JDK8 #62 (See [https://builds.apache.org/job/HBase-1.2-JDK8/62/])
HBASE-17039 SimpleLoadBalancer schedules large amount of invalid region (liyu: rev 14db35c620de41e0385f3260d38b8fda0116389f)
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/SimpleLoadBalancer.java


> SimpleLoadBalancer schedules large amount of invalid region moves
> -----------------------------------------------------------------
>
>                 Key: HBASE-17039
>                 URL: https://issues.apache.org/jira/browse/HBASE-17039
>             Project: HBase
>          Issue Type: Bug
>          Components: Balancer
>    Affects Versions: 2.0.0, 1.3.0, 1.1.7, 1.2.4
>            Reporter: Charlie Qiangeng Xu
>            Assignee: Charlie Qiangeng Xu
>             Fix For: 2.0.0, 1.4.0, 1.2.5, 1.1.8
>
>         Attachments: HBASE-17039.patch
>
>
> After growing one of our clusters to 1600 nodes, we observed a large number of invalid
> region moves (more than 30k) fired by the balance chore. We simulated the problem
> and printed the balance plan, and found that many servers holding two regions of a
> given table (we use the by-table strategy) sent both regions to two other servers that
> held zero regions.
> In the SimpleLoadBalancer's balanceCluster function,
> the code block that determines the underLoadedServers might have a problem:
> {code}
>       if (load >= min && load > 0) {
>         continue; // look for other servers which haven't reached min
>       }
>       int regionsToPut = min - load;
>       if (regionsToPut == 0)
>       {
>         regionsToPut = 1;
>       }
> {code}
> If min is zero, a server whose load is zero (equal to min) is marked as underloaded,
> which causes the behavior described above.
> Since we increased the cluster size to 1600+ nodes, many tables that have only 1000
> regions now hit this issue.
> After fixing this, the balance plan went back to normal.
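
The arithmetic behind the report can be sketched as follows. This is an illustration only, not the patch attached to the issue: the class and method names (UnderloadCheckSketch, minLoad, regionsToPut) are hypothetical, and it assumes min is the integer (floor) average of regions per server. Only the condition inside regionsToPut is taken from the snippet quoted above.

```java
public class UnderloadCheckSketch {

    // Assumption: min is the floor of the average number of regions per server.
    static int minLoad(int numRegions, int numServers) {
        return numRegions / numServers; // integer division
    }

    // Mirrors the quoted check. Returns how many regions a server with the
    // given load would be asked to receive, or 0 if the server is skipped.
    static int regionsToPut(int load, int min) {
        if (load >= min && load > 0) {
            return 0; // already at or above min: not treated as underloaded
        }
        int regionsToPut = min - load;
        if (regionsToPut == 0) {
            regionsToPut = 1; // only reachable when load == min == 0
        }
        return regionsToPut;
    }

    public static void main(String[] args) {
        // A table with 1000 regions on a 1600-server cluster, as in the report.
        int min = minLoad(1000, 1600);
        System.out.println("min = " + min);       // prints "min = 0"
        System.out.println(regionsToPut(0, min)); // prints 1: empty server marked underloaded
        System.out.println(regionsToPut(2, min)); // prints 0: server with two regions is skipped
    }
}
```

Under these assumptions, every server holding zero regions of the table is asked to take one region even though min is already satisfied, which is consistent with the surplus of moves described in the report.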



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
