hbase-dev mailing list archives

From "Bryan Duxbury (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-55) [hbase] Improve Master region assignment function
Date Wed, 13 Feb 2008 00:39:07 GMT

    [ https://issues.apache.org/jira/browse/HBASE-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12568377#action_12568377 ]

Bryan Duxbury commented on HBASE-55:

I think what we actually need to do is better define what "server load" means. After all,
we're seeking the set of region assignments that gives all region servers roughly the same
load.

So, to reformulate the name of this issue a little, we need a better way to calculate total
server load. I'm thinking this should be a function of the total size of all regions. The
rationale behind this is that the bigger the region (i.e., the underlying map files), the more
time it will take to do gets, puts, compactions, etc. In the long run, machines with bigger
regions will be more heavily utilized than machines with smaller regions.

So, to balance region assignment, we should sum up all the sizes of all the regions currently
assigned per server, calculate an average, and then reduce the load of overloaded servers
by deallocating regions from them, and increase the load of underloaded servers by adding
newly unassigned regions. 
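
The balancing steps above can be sketched roughly as follows. This is an illustrative
Python sketch, not HBase code: the names `server_loads` and `plan_rebalance`, the
dictionary layout, the rule "release a region only if the server stays at or above the
average", and the greedy "give the next region to the least-loaded server" step are all
assumptions made for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: server name -> {region name: region size in bytes}
Assignments = Dict[str, Dict[str, int]]


def server_loads(assignments: Assignments) -> Dict[str, int]:
    """Load of a server = total size of the regions currently assigned to it."""
    return {server: sum(regions.values()) for server, regions in assignments.items()}


def plan_rebalance(
    assignments: Assignments, unassigned: Dict[str, int]
) -> Tuple[List[Tuple[str, str]], List[Tuple[str, str]]]:
    """Return (regions to close as (server, region), regions to open as (region, server))."""
    loads = server_loads(assignments)
    if not loads:
        return [], []
    average = sum(loads.values()) / len(loads)

    closes: List[Tuple[str, str]] = []
    pool = dict(unassigned)  # regions waiting for a server

    # Step 1: shed regions from overloaded servers, largest first, but only
    # while the server would still be at or above the average afterwards.
    for server in sorted(assignments):
        by_size = sorted(assignments[server].items(), key=lambda kv: (-kv[1], kv[0]))
        for region, size in by_size:
            if loads[server] - size >= average:
                loads[server] -= size
                closes.append((server, region))
                pool[region] = size

    # Step 2: hand every waiting region to whichever server is least loaded now.
    opens: List[Tuple[str, str]] = []
    for region, size in sorted(pool.items(), key=lambda kv: (-kv[1], kv[0])):
        target = min(sorted(loads), key=lambda s: loads[s])
        loads[target] += size
        opens.append((region, target))

    return closes, opens


if __name__ == "__main__":
    current = {
        "rs1": {"a": 400, "b": 300, "c": 100},
        "rs2": {"d": 100},
        "rs3": {"e": 100},
    }
    print(plan_rebalance(current, {"f": 200}))
```

In this example rs1 starts at 800 against an average of about 333, so it gives up its
largest region, and the released region plus the new one go to the two lightly loaded
servers. A greedy pass like this narrows the spread but does not guarantee equal loads.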

To incorporate the idea of placing daughter regions on different machines, we can just add
a check that skips the server the sibling daughter was just assigned to.
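
A minimal sketch of that extra check, again in illustrative Python rather than HBase
code. The helper names `pick_server` and `assign_daughters`, the load dictionary, and the
fallback used when only one server exists are assumptions made for the example.

```python
from typing import Dict, List, Optional, Tuple


def pick_server(loads: Dict[str, int], exclude: Optional[str] = None) -> str:
    """Pick the least-loaded server, skipping `exclude` when another choice exists."""
    candidates = [s for s in sorted(loads) if s != exclude]
    if not candidates:
        # Only one server in the cluster: nothing else to choose from.
        candidates = sorted(loads)
    return min(candidates, key=lambda s: loads[s])


def assign_daughters(
    loads: Dict[str, int], daughters: List[Tuple[str, int]]
) -> Dict[str, str]:
    """Place the daughters of one split, never reusing the server just assigned to."""
    loads = dict(loads)  # work on a copy so the caller's view is unchanged
    placement: Dict[str, str] = {}
    last_target: Optional[str] = None
    for region, size in daughters:
        target = pick_server(loads, exclude=last_target)
        loads[target] += size
        placement[region] = target
        last_target = target
    return placement


if __name__ == "__main__":
    cluster = {"rs1": 100, "rs2": 500, "rs3": 600}
    print(assign_daughters(cluster, [("daughter_a", 50), ("daughter_b", 50)]))
```

Without the exclusion, both daughters here would land on rs1, since it remains the
least-loaded server even after receiving the first one. With it, the second daughter
goes to the next-best server instead.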

> [hbase] Improve Master region assignment function
> -------------------------------------------------
>                 Key: HBASE-55
>                 URL: https://issues.apache.org/jira/browse/HBASE-55
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Bryan Duxbury
>             Fix For: 0.2.0
> We would like the master's region assignment function to take into account more factors
> when choosing where to assign regions.
> - More advanced accounting of load on regionserver - memory, # requests, etc
> - Don't deploy both daughter regions to the same regionserver
> - Assign regions where the underlying DFS blocks are hosted if possible
> Please add additional ideas in comments as they come up.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
