hbase-dev mailing list archives

From "Sean Busbey (Jira)" <j...@apache.org>
Subject [jira] [Created] (HBASE-24139) Balancer should avoid leaving idle region servers
Date Wed, 08 Apr 2020 12:57:00 GMT
Sean Busbey created HBASE-24139:
-----------------------------------

             Summary: Balancer should avoid leaving idle region servers
                 Key: HBASE-24139
                 URL: https://issues.apache.org/jira/browse/HBASE-24139
             Project: HBase
          Issue Type: Improvement
          Components: Balancer, Operability
            Reporter: Sean Busbey
            Assignee: Sean Busbey


After HBASE-15529 the StochasticLoadBalancer makes the decision to run based on its internal
cost functions rather than the simple region count skew of BaseLoadBalancer.

Given the default weights for those cost functions, the default minimum cost that indicates a
need to rebalance, and a density of ~90 regions per region server, the balancer is not very
responsive to newly added region servers on non-trivial cluster sizes:

* For clusters of ~10 nodes, the defaults treat a single RS at 0 regions as a need to balance.
* For clusters of >20 nodes, the defaults will not treat a single RS at 0 regions as a need
to balance. 2 RS at 0 will cause it to balance.
* For clusters of ~100 nodes, having 6 RS with no regions will still not meet the threshold to
cause a balance.

Note that this is the decision whether to look at balancer plans at all. The calculation is
heavily dominated by the region count skew (it has weight 500 and all other weights are ~105),
so barring a very significant change in all other cost functions this condition will persist
indefinitely.
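
To illustrate the dominance described above, here is a minimal sketch. It assumes the
overall cost is a weighted average of per-function costs, each already scaled into [0, 1],
and that a balance is triggered when that average reaches a minimum-cost threshold. The
class and method names are hypothetical and are not HBase APIs. The sketch also lumps every
function other than region count skew into a single term of weight 105, which is one reading
of the "~105" figure, and leaves the threshold as a parameter.

```java
/** Hypothetical sketch, not HBase code: a weighted, normalized balancer cost. */
final class WeightedCostSketch {

  /**
   * Returns sum(weights[i] * costs[i]) / sum(weights[i]).
   * Each cost is assumed to be scaled into [0, 1] already.
   */
  static double normalizedCost(double[] weights, double[] costs) {
    if (weights.length != costs.length) {
      throw new IllegalArgumentException("weights and costs must have the same length");
    }
    double weighted = 0.0;
    double totalWeight = 0.0;
    for (int i = 0; i < weights.length; i++) {
      weighted += weights[i] * costs[i];
      totalWeight += weights[i];
    }
    return totalWeight > 0.0 ? weighted / totalWeight : 0.0;
  }

  /** Assumed decision rule: balance only when the normalized cost reaches minCost. */
  static boolean needsBalance(double[] weights, double[] costs, double minCost) {
    return normalizedCost(weights, costs) >= minCost;
  }

  public static void main(String[] args) {
    // Weights from the issue text: 500 for region count skew, ~105 for everything else
    // (treated here as one combined term, which is an assumption).
    double[] weights = {500.0, 105.0};

    // Hypothetical costs: a small skew cost, all other functions reporting zero.
    System.out.println(normalizedCost(weights, new double[] {0.05, 0.0}));

    // Upper bound on what the other functions can add when skew cost is zero.
    System.out.println(normalizedCost(weights, new double[] {0.0, 1.0}));
  }
}
```

Under these assumptions the skew term alone sets most of the result, so a few empty servers
in a large cluster move the normalized cost only slightly.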

Two possible approaches:

* add a new cost function that's essentially "don't have RS with 0 regions" that an operator
can tune
* add a short-circuit condition to the {{needsBalance}} method that checks for empty RS, similar
to the check we do for colocated region replicas
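
Both approaches can be sketched roughly as follows. This is a self-contained illustration
that works on a plain array of per-server region counts. The class and method names are
hypothetical and do not reflect the actual StochasticLoadBalancer code. The rule that an
empty RS only matters when some other RS hosts more than one region is also an assumption
made for this sketch.

```java
/** Hypothetical sketch, not HBase code: two ways to react to idle region servers. */
final class IdleServerSketch {

  /** Counts servers that host zero regions. */
  static int countIdleServers(int[] regionsPerServer) {
    int idle = 0;
    for (int regions : regionsPerServer) {
      if (regions == 0) {
        idle++;
      }
    }
    return idle;
  }

  /**
   * Approach 1: a cost in [0, 1] equal to the fraction of servers that are idle.
   * An operator could give this its own weight alongside the other cost functions.
   */
  static double idleServerCost(int[] regionsPerServer) {
    if (regionsPerServer.length == 0) {
      return 0.0;
    }
    return (double) countIdleServers(regionsPerServer) / regionsPerServer.length;
  }

  /**
   * Approach 2: a short-circuit check to run before the weighted cost comparison.
   * Returns true when at least one server is empty and at least one other server
   * hosts more than one region, so a move could actually fill the empty server.
   */
  static boolean shouldForceBalance(int[] regionsPerServer) {
    boolean hasIdle = false;
    boolean hasDonor = false;
    for (int regions : regionsPerServer) {
      if (regions == 0) {
        hasIdle = true;
      } else if (regions > 1) {
        hasDonor = true;
      }
    }
    return hasIdle && hasDonor;
  }

  public static void main(String[] args) {
    int[] cluster = {90, 90, 90, 0}; // hypothetical cluster with one newly added RS
    System.out.println(idleServerCost(cluster));
    System.out.println(shouldForceBalance(cluster));
  }
}
```

The cost-function form stays tunable through its weight, while the short-circuit form
bypasses the weighted threshold entirely whenever an empty RS is present.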



--
This message was sent by Atlassian Jira
(v8.3.4#803005)