hbase-issues mailing list archives

From "Matt Corgan (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-3373) Allow regions of specific table to be load-balanced
Date Fri, 28 Jan 2011 21:02:44 GMT

    [ https://issues.apache.org/jira/browse/HBASE-3373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12988229#action_12988229 ]

Matt Corgan commented on HBASE-3373:

Gotcha.  I guess I was thinking of it more as a quick upgrade to the current load balancer,
which only looks at region count.  We store a lot of time series data, and regions that split
were left on the same server while the balancer moved cold regions off.  I wrote a little
client-side consistent hashing balancer that solved the problem in our case, but there are
definitely better ways.  Consistent hashing also binds regions to servers across cluster
restarts, which helps keep regions near their last major-compacted HDFS files.
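
To make the idea concrete, here is a minimal sketch of consistent-hash region placement.
It is not the balancer described above and it does not call any HBase API: the names
ConsistentHashRing, plan_assignments, the vnodes parameter, and the "rsN" server names are
all hypothetical, and a real client would still have to issue the actual region moves.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a stable 64-bit integer (independent of process restarts)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class ConsistentHashRing:
    """Hypothetical ring that maps region names to region servers."""

    def __init__(self, servers, vnodes=100):
        # Each server is placed on the ring at `vnodes` points so that load
        # spreads more evenly than with a single point per server.
        self._ring = sorted(
            (_hash(f"{server}#{i}"), server)
            for server in servers
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def server_for(self, region_name: str) -> str:
        if not self._ring:
            raise ValueError("no servers on the ring")
        # Walk clockwise to the first server point after the region's hash,
        # wrapping around at the end of the ring.
        idx = bisect.bisect_right(self._points, _hash(region_name)) % len(self._ring)
        return self._ring[idx][1]


def plan_assignments(regions, servers):
    """Return a {region_name: server} plan; purely a calculation, no side effects."""
    ring = ConsistentHashRing(servers)
    return {region: ring.server_for(region) for region in regions}


if __name__ == "__main__":
    servers = ["rs1", "rs2", "rs3", "rs4"]  # assumed server names
    regions = [f"timeseries,{i:04d}" for i in range(20)]  # assumed region names
    for region, server in plan_assignments(regions, servers).items():
        print(region, "->", server)
```

Because the placement depends only on the region name and the set of servers, the same
plan comes back after a cluster restart, and removing one server only relocates the
regions that were on it.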

Whatever balancing scheme you do use, don't you need some starting point for randomly distributing
the regions?  If no other data is available or you need a tie breaker, maybe consistent hashing
is better than round robin or purely random placement.

> Allow regions of specific table to be load-balanced
> ---------------------------------------------------
>                 Key: HBASE-3373
>                 URL: https://issues.apache.org/jira/browse/HBASE-3373
>             Project: HBase
>          Issue Type: Improvement
>          Components: master
>    Affects Versions: 0.20.6
>            Reporter: Ted Yu
>             Fix For: 0.92.0
> From our experience, cluster can be well balanced and yet, one table's regions may be badly concentrated on few region servers.
> For example, one table has 839 regions (380 regions at time of table creation) out of which 202 are on one server.
> It would be desirable for load balancer to distribute regions for specified tables evenly across the cluster. Each of such tables has number of regions many times the cluster size.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
