hbase-issues mailing list archives

From "Francis Liu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-18215) some advises about refactoring of rsgroup
Date Mon, 19 Jun 2017 17:03:00 GMT

    [ https://issues.apache.org/jira/browse/HBASE-18215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16054334#comment-16054334 ]

Francis Liu commented on HBASE-18215:

#1 You don't really need to put tables on the master anymore; just create another regionserver
group to put the tables on. This makes meta much more available and allows you to restart
the master when needed without causing impact. Adding the master as part of an rsgroup may
lead to operational surprises. I'd recommend letting the master handle only master responsibilities.

#2 From a quick look at the patch, it does indeed use a local file. That will make the rsgroup
code simpler, but you are pushing complexity onto the user in the form of managing the file:
persistence in case of failure, dealing with concurrent updates, etc. Having APIs isn't that
complex, and they are much more user-friendly and possibly more flexible.

#3 Yes, we should. Getting the rsgroup patch in took a herculean effort, hence I focused only
on the essentials. As Stack mentioned, we need a way to do this that doesn't cause unacceptable
leaking of rsgroup into core code.

#4 Good catch. Would you like to submit a separate patch for this?

#5 Can you provide a specific scenario? When the rsgroup patch was written, if I remember correctly,
the reverse was true: AM could not handle null results when calling randomAssignment().

#6 Sounds reasonable.   
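
To make the overwrite discussed for RSGroupBasedLoadBalancer.roundRobinAssignment (the point answered in #4) concrete, here is a minimal sketch. It is not the HBase code: servers and regions are plain strings, and BOGUS, mergeWithPutAll and mergeAppending are hypothetical names chosen for illustration. It only shows that Map.putAll replaces a list stored under a shared key, while appending per key keeps the regions from every group.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class AssignmentMergeSketch {
    // Hypothetical stand-in for the placeholder server key; not the real HBase constant.
    static final String BOGUS = "bogus-server";

    // Plain putAll: a later group's list replaces an earlier one stored under the same key.
    static Map<String, List<String>> mergeWithPutAll(List<Map<String, List<String>>> perGroup) {
        Map<String, List<String>> all = new HashMap<>();
        for (Map<String, List<String>> group : perGroup) {
            all.putAll(group);
        }
        return all;
    }

    // Append instead of replace, so the regions from every group survive the merge.
    static Map<String, List<String>> mergeAppending(List<Map<String, List<String>>> perGroup) {
        Map<String, List<String>> all = new HashMap<>();
        for (Map<String, List<String>> group : perGroup) {
            for (Map.Entry<String, List<String>> e : group.entrySet()) {
                all.computeIfAbsent(e.getKey(), k -> new ArrayList<>()).addAll(e.getValue());
            }
        }
        return all;
    }

    public static void main(String[] args) {
        // Two groups that both map regions to the placeholder server.
        List<Map<String, List<String>>> perGroup = List.of(
            Map.of(BOGUS, List.of("region-1")),
            Map.of(BOGUS, List.of("region-2")));
        System.out.println(mergeWithPutAll(perGroup).get(BOGUS)); // region-1 is lost
        System.out.println(mergeAppending(perGroup).get(BOGUS));  // both regions kept
    }
}
```

Whether appending is the right fix for the actual balancer is a separate question; the sketch only illustrates the map behaviour behind the report.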

> some advises about refactoring of rsgroup
> -----------------------------------------
>                 Key: HBASE-18215
>                 URL: https://issues.apache.org/jira/browse/HBASE-18215
>             Project: HBase
>          Issue Type: Improvement
>          Components: Balancer
>            Reporter: chenxu
>         Attachments: HBASE-18215-1.2.4-v1.patch
> Recently we integrated rsgroup into our cluster and, after doing so, found some points
> that could be refactored. They may not all be right, but I think they are worth sharing.
> #1 When hbase.balancer.tablesOnMaster is configured, RSGroupBasedLoadBalancer should consider
> masterServer assignment first in balanceCluster, roundRobinAssignment and retainAssignment,
> and do the same thing as BaseLoadBalancer.
> #2 Why not use a local file as the persistence layer instead of the rsgroup table?
> In our implementation, we first modify the local rsgroup file, then load the group info
> into memory, and after that execute the balancer command; everything is OK.
> When loading, we do some sanity checks:
> (1) one server cannot be owned by multiple groups
> (2) one table cannot be owned by multiple groups
> (3) if a group has tables, it must also have servers
> (4) the default group must have servers in it
> If the sanity checks do not pass, the rest of the process is abandoned. Working this way can
> greatly reduce the complexity of the rsgroup implementation: there is no need to wait for the
> rsgroup table to be online, and methods like moveServers, moveTables, addRSGroup, removeRSGroup
> and moveServersAndTables can be removed from RSGroupAdminService. Only a refresh method is
> needed (modify the persistence layer first, then refresh the memory).
> #3 We should add some group information to the master web UI.
> To do this, RSGroupBasedLoadBalancer should move to the hbase-server module, because
> MasterStatusTmpl.jamon depends on it.
> #4 There may be an issue with RSGroupBasedLoadBalancer.roundRobinAssignment:
> if two groups both include BOGUS_SERVER_NAME, assignments.putAll will overwrite the previous
> entry.
> #5 There may be an issue with RSGroupBasedLoadBalancer.randomAssignment:
> when the return value is BOGUS_SERVER_NAME, AM cannot handle this case. We should return
> null instead of BOGUS_SERVER_NAME.
> #6 When RSGroupBasedLoadBalancer.balanceCluster executes, groups are balanced one by one;
> if there are too many groups, we can do this in parallel.
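
The four sanity checks listed in the description can be sketched roughly as follows. This is only an illustration, not the attached patch: the Group class, the check method and the error strings are hypothetical, and the default group is assumed to be named "default".

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class RSGroupSanitySketch {
    // Hypothetical, simplified group model: a name plus server and table names.
    static final class Group {
        final String name;
        final Set<String> servers;
        final Set<String> tables;

        Group(String name, Set<String> servers, Set<String> tables) {
            this.name = name;
            this.servers = servers;
            this.tables = tables;
        }
    }

    // Returns one message per violated rule; an empty list means the config passed.
    static List<String> check(List<Group> groups) {
        List<String> errors = new ArrayList<>();
        Map<String, String> serverOwner = new HashMap<>();
        Map<String, String> tableOwner = new HashMap<>();
        boolean defaultHasServers = false;
        for (Group g : groups) {
            for (String s : g.servers) {
                // (1) a server may belong to only one group
                String prev = serverOwner.putIfAbsent(s, g.name);
                if (prev != null) {
                    errors.add("server " + s + " is in both " + prev + " and " + g.name);
                }
            }
            for (String t : g.tables) {
                // (2) a table may belong to only one group
                String prev = tableOwner.putIfAbsent(t, g.name);
                if (prev != null) {
                    errors.add("table " + t + " is in both " + prev + " and " + g.name);
                }
            }
            // (3) a group that has tables must also have servers
            if (!g.tables.isEmpty() && g.servers.isEmpty()) {
                errors.add("group " + g.name + " has tables but no servers");
            }
            // Assumption: the default group is named "default".
            if ("default".equals(g.name) && !g.servers.isEmpty()) {
                defaultHasServers = true;
            }
        }
        // (4) the default group must have servers
        if (!defaultHasServers) {
            errors.add("default group has no servers");
        }
        return errors;
    }
}
```

A loader along these lines would refuse to refresh the in-memory group info whenever check returns a non-empty list, which matches the "give up" behaviour described above.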

This message was sent by Atlassian JIRA
