hbase-issues mailing list archives

From "Feng Honghua (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-5487) Generic framework for Master-coordinated tasks
Date Thu, 10 Oct 2013 14:18:42 GMT

    [ https://issues.apache.org/jira/browse/HBASE-5487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13791520#comment-13791520 ]

Feng Honghua commented on HBASE-5487:
-------------------------------------

Since HBASE-9726 has been closed as a duplicate of this issue, I have copied the proposal
from HBASE-9726 here for discussion/reference:

The current assignment process (and the split process) relies on ZK for communication between
the master and the regionservers. This pattern has two drawbacks:
  1. For a cluster with a large number of regions (say, 10K-100K), ZK becomes the bottleneck
for cluster restart, because the assignment/split status and progress are stored in ZK and ZK
has limited write throughput.
  2. Since a ZK watch is one-time and event notification/processing is asynchronous, there
is no guarantee that the master (the watcher) is notified of the up-to-date status/progress in
time. The master therefore relies on idempotence for its correctness, which makes the logic
and code very hard to understand and maintain.
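
To illustrate the second drawback, here is a minimal toy model of a one-time watch with
asynchronous delivery. It is only a sketch: ToyZNode, simulate and the state names OFFLINE,
OPENING and OPENED are illustrative assumptions, not the ZooKeeper client API or HBase code.
The regionserver writes two states before the master handles its first notification, so the
master never observes the intermediate one.

```python
from collections import deque


class ToyZNode:
    """Toy stand-in for a znode with a one-time watch (hypothetical, not ZooKeeper)."""

    def __init__(self, data):
        self.data = data
        self.watch_armed = False

    def get(self, watch=False):
        # Reading with watch=True arms the watch for the next change only.
        if watch:
            self.watch_armed = True
        return self.data

    def set(self, data, event_queue):
        self.data = data
        # One-time semantics: the watch fires once and is then disarmed.
        if self.watch_armed:
            self.watch_armed = False
            event_queue.append("NodeDataChanged")


def simulate():
    events = deque()                    # notifications waiting to be processed
    node = ToyZNode("OFFLINE")
    observed = [node.get(watch=True)]   # master reads and arms the watch

    # The regionserver updates the node twice before the master reacts.
    node.set("OPENING", events)         # fires the watch
    node.set("OPENED", events)          # no watch armed, so no notification

    # The master processes its queue later and re-reads on each event.
    while events:
        events.popleft()
        observed.append(node.get(watch=True))
    return observed


print(simulate())  # ['OFFLINE', 'OPENED']
```

In this toy run the master sees OFFLINE and then OPENED, and has to infer that OPENING
happened in between.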

The proposed new assignment design is as follows:
  1. Assignment/split status and progress are stored in a system table (say 'assignTable'),
like the meta table, rather than in ZK. This improves write throughput and hence the
performance of restart for clusters with a large number of regions.
  2. The communication pattern for assignment/split changes as follows: the master talks
directly with the regionserver (the master issues an assign request to the regionserver, and
the regionserver reports the assign progress back to the master) and records the
status/progress of each assignment/split in the 'assignTable'. In case of master failure, the
new active master reads the 'assignTable' to rebuild its knowledge of the ongoing
assignment/split tasks and continues from there. (The regionserver doesn't write to the
'assignTable'.)
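
A rough sketch of how this flow could look, using an in-memory dictionary in place of the
real system table. All class, method and state names here (AssignTable, Master,
RegionServer, PENDING_OPEN, OPENED) are hypothetical and chosen only for illustration; the
proposal does not define a schema or an RPC interface. For simplicity, the new master simply
re-issues the open request for any unfinished assignment.

```python
class AssignTable:
    """In-memory stand-in for the proposed 'assignTable' (hypothetical schema)."""

    def __init__(self):
        self.rows = {}

    def put(self, region, state, server):
        self.rows[region] = {"state": state, "server": server}

    def scan(self):
        return dict(self.rows)


class RegionServer:
    def __init__(self, name):
        self.name = name
        self.online = set()

    def open_region(self, region):
        # Replies directly to the master; never writes to the table.
        self.online.add(region)
        return "OPENED"


class Master:
    def __init__(self, table, servers):
        self.table = table
        self.servers = servers
        self.in_flight = {}  # region -> target server name

    def begin_assign(self, region, server_name):
        # Record the intent before contacting the regionserver.
        self.table.put(region, "PENDING_OPEN", server_name)
        self.in_flight[region] = server_name

    def finish_assign(self, region):
        server_name = self.in_flight.pop(region)
        state = self.servers[server_name].open_region(region)
        self.table.put(region, state, server_name)  # only the master writes

    def recover(self):
        # A new active master rebuilds its view of ongoing tasks from the table.
        for region, row in self.table.scan().items():
            if row["state"] != "OPENED":
                self.in_flight[region] = row["server"]
        for region in list(self.in_flight):
            self.finish_assign(region)


def run_failover_demo():
    table = AssignTable()
    servers = {"rs1": RegionServer("rs1"), "rs2": RegionServer("rs2")}
    old_master = Master(table, servers)
    old_master.begin_assign("r1", "rs1")
    old_master.finish_assign("r1")
    old_master.begin_assign("r2", "rs2")  # old master fails before finishing r2

    new_master = Master(table, servers)   # starts with no in-memory state
    new_master.recover()
    return table, servers, new_master
```

Calling run_failover_demo() leaves both r1 and r2 recorded as OPENED in the table, with r2
completed by the new master from the persisted PENDING_OPEN row.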

> Generic framework for Master-coordinated tasks
> ----------------------------------------------
>
>                 Key: HBASE-5487
>                 URL: https://issues.apache.org/jira/browse/HBASE-5487
>             Project: HBase
>          Issue Type: New Feature
>          Components: master, regionserver, Zookeeper
>    Affects Versions: 0.94.0
>            Reporter: Mubarak Seyed
>            Priority: Critical
>         Attachments: Region management in Master.pdf
>
>
> Need a framework to execute master-coordinated tasks in a fault-tolerant manner.
> Master-coordinated tasks such as online schema change and delete-range (deleting region(s)
> based on start/end key) can make use of this framework.
> The advantages of the framework are:
> 1. Eliminate repeated code in Master, ZooKeeper tracker and Region-server for
> master-coordinated tasks
> 2. Ability to abstract the common functions across Master -> ZK and RS -> ZK
> 3. Easy to plug in new master-coordinated tasks without adding code to core components



--
This message was sent by Atlassian JIRA
(v6.1#6144)
