hbase-issues mailing list archives

From "Biju Nair (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-14215) Default cost used for PrimaryRegionCountSkewCostFunction is not sufficient
Date Thu, 03 Sep 2015 20:08:45 GMT

    [ https://issues.apache.org/jira/browse/HBASE-14215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14729693#comment-14729693 ]

Biju Nair commented on HBASE-14215:
-----------------------------------

Thanks [~enis] for your comments. Disabling rack awareness will enable the stochastic load
balancer (SLB) to come up with a better plan even with a lower
{{hbase.master.balancer.stochastic.primaryRegionCountCost}}. I will try to run some tests.

Given that potential candidates are generated randomly, one would assume that enough rounds
of candidate generation would reach the "global optimum" rather than get stuck in a
"local optimum". No?
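
To make the question concrete, here is a toy sketch of how a search with random candidates can still stall. It assumes the search keeps a candidate only when it strictly lowers the cost and that each candidate is a single small move; the cost landscape, class name and method names are made up for illustration and are not HBase code.

```java
import java.util.Random;

public class LocalOptimumSketch {
    // Hypothetical cost per position. Position 2 is a local minimum (cost 3),
    // position 7 is the global minimum (cost 0).
    static final int[] COST = {5, 4, 3, 6, 7, 6, 2, 0, 1, 4};

    // Propose a random neighbor (one step left or right) and keep it only
    // if it strictly lowers the cost; otherwise stay where we are.
    static int greedySearch(int start, int steps, long seed) {
        Random rnd = new Random(seed);
        int pos = start;
        for (int i = 0; i < steps; i++) {
            int candidate = pos + (rnd.nextBoolean() ? 1 : -1);
            if (candidate < 0 || candidate >= COST.length) {
                continue; // proposal falls outside the landscape
            }
            if (COST[candidate] < COST[pos]) {
                pos = candidate;
            }
        }
        return pos;
    }

    public static void main(String[] args) {
        for (int start : new int[] {0, 5}) {
            int end = greedySearch(start, 10_000, 42L);
            System.out.println("start " + start + " -> position " + end
                + ", cost " + COST[end]);
        }
    }
}
```

Starting at position 0 the search settles at position 2 and never leaves, because both neighbors cost more, while starting at position 5 it reaches position 7. Under these assumptions, more candidate generations alone do not help once every single move makes the cost worse.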

Since we added a new cost function for primary replica skew, would taking primary replicas
into account in the candidate generator (maybe in {{RegionReplicaCandidateGenerator}})
help keep {{hbase.master.balancer.stochastic.primaryRegionCountCost}} lower?
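
A rough sketch of what a primary-aware candidate could look like, only to illustrate the idea: propose moving one primary from the server holding the most primaries to the server holding the fewest. The class and method names are hypothetical, this is not the {{RegionReplicaCandidateGenerator}} API, and it ignores rack and replica co-location constraints entirely.

```java
import java.util.Arrays;

public class PrimaryAwareCandidateSketch {
    // Returns {fromServer, toServer} for the next proposed move, or null
    // when the primary counts already differ by at most one.
    static int[] pickMove(int[] primariesPerServer) {
        int max = 0;
        int min = 0;
        for (int i = 1; i < primariesPerServer.length; i++) {
            if (primariesPerServer[i] > primariesPerServer[max]) max = i;
            if (primariesPerServer[i] < primariesPerServer[min]) min = i;
        }
        if (primariesPerServer[max] - primariesPerServer[min] <= 1) {
            return null;
        }
        return new int[] {max, min};
    }

    // Applies proposed moves to a copy of the counts until none is left.
    static int[] balance(int[] primariesPerServer) {
        int[] counts = primariesPerServer.clone();
        int[] move;
        while ((move = pickMove(counts)) != null) {
            counts[move[0]]--; // one primary leaves the fullest server
            counts[move[1]]++; // and lands on the emptiest server
        }
        return counts;
    }

    public static void main(String[] args) {
        // Primary counts per region server, taken from the issue description.
        int[] start = {102, 85, 88, 120, 120, 124, 135, 124, 129};
        System.out.println(Arrays.toString(balance(start)));
    }
}
```

Because every proposed move already reduces primary skew, a generator along these lines might let the cost multiplier stay lower; whether that holds in the real balancer would need testing.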

> Default cost used for PrimaryRegionCountSkewCostFunction is not sufficient 
> ---------------------------------------------------------------------------
>
>                 Key: HBASE-14215
>                 URL: https://issues.apache.org/jira/browse/HBASE-14215
>             Project: HBase
>          Issue Type: Bug
>          Components: Balancer
>            Reporter: Biju Nair
>            Priority: Minor
>         Attachments: 14215-v1.txt
>
>
> The current multiplier of 500 used in the stochastic balancer cost function {{PrimaryRegionCountSkewCostFunction}}
> to calculate the cost of total primary replica skew doesn't seem to be sufficient to
> prevent skews (refer to HBASE-14110). We would want the default cost to be a higher value
> so that skew in primary region replicas has a higher cost. The following are the test results
> from setting the multiplier to 10000 (the same as the region replica rack cost multiplier)
> on a 3-rack, 9-RS-node cluster, which seems to get the balancer to distribute the primaries uniformly.
> *Initial primary replica distribution - using the current multiplier*
> | r1n10 | 102 |
> | r1n11 | 85 |
> | r1n9 | 88 |
> | r2n10 | 120 |
> | r2n11 | 120 |
> | r2n9 | 124 |
> | r3n10 | 135 |
> | r3n11 | 124 |
> | r3n9 | 129 |
> *After a long duration of reads & writes - using the current multiplier*
> | r1n10 | 102 |
> | r1n11 | 85 |
> | r1n9 | 88 |
> | r2n10 | 120 |
> | r2n11 | 120 |
> | r2n9 | 124 |
> | r3n10 | 135 |
> | r3n11 | 124 |
> | r3n9 | 129 |
> *After manual balancing*
> | r1n10 | 102 |
> | r1n11 | 85 |
> | r1n9 | 88 |
> | r2n10 | 120 |
> | r2n11 | 120 |
> | r2n9 | 124 |
> | r3n10 | 135 |
> | r3n11 | 124 |
> | r3n9 | 129 |
> *Increased multiplier for primaryRegionCountSkewCost to 10000*
> | r1n10 | 114 |
> | r1n11 | 113 |
> | r1n9 | 114 |
> | r2n10 | 114 |
> | r2n11 | 114 |
> | r2n9 | 113 |
> | r3n10 | 115 |
> | r3n11 | 115 |
> | r3n9 | 115 |
> Setting the {{PrimaryRegionCountSkewCostFunction}} multiplier value to 10000 should help
> general HBase use.
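
For a sense of how different the two distributions above are, the sketch below computes a simple skew measure: the sum of each server's absolute deviation from the mean primary count. This is an assumed, simplified measure for illustration and not necessarily the exact formula {{PrimaryRegionCountSkewCostFunction}} uses; the class and method names are hypothetical.

```java
public class PrimarySkewSketch {
    // Sum of absolute deviations from the mean primary count per server.
    static double totalDeviation(int[] counts) {
        double sum = 0;
        for (int c : counts) {
            sum += c;
        }
        double mean = sum / counts.length;
        double deviation = 0;
        for (int c : counts) {
            deviation += Math.abs(c - mean);
        }
        return deviation;
    }

    public static void main(String[] args) {
        // Primary counts per region server, copied from the tables above.
        int[] withMultiplier500 = {102, 85, 88, 120, 120, 124, 135, 124, 129};
        int[] withMultiplier10000 = {114, 113, 114, 114, 114, 113, 115, 115, 115};
        System.out.printf("multiplier 500:   %.2f%n", totalDeviation(withMultiplier500));
        System.out.printf("multiplier 10000: %.2f%n", totalDeviation(withMultiplier10000));
    }
}
```

Both runs hold 1027 primaries in total (a mean of about 114.1 per server); by this measure the deviation drops from about 134.7 with the current multiplier to about 5.3 with 10000.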



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
