hbase-issues mailing list archives

From "Jonathan Hsieh (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-5487) Generic framework for Master-coordinated tasks
Date Thu, 28 Mar 2013 08:11:20 GMT

    [ https://issues.apache.org/jira/browse/HBASE-5487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13616129#comment-13616129
] 

Jonathan Hsieh commented on HBASE-5487:
---------------------------------------

To do a major overhaul, we need something stronger than "the code is hard to read".  I agree
that it is hard to follow (see: http://people.apache.org/~jmhsieh/hbase/120905-hbase-assignment.pdf)
but it seems to be basically working which is a pretty strong argument.  Let's compare and
point out what is wrong/broken in the current implementation and how the new design won't
have those problems.  

The spreadsheet link is my first step to enumerating semantics and distilling the set of possible
problems and things that are being guarded from races.  Any major-overhaul solution should
make sure that these operations, when issued concurrently, interact according to a sane set
of semantics in the face of failures.

bq. Only for the current document version... tables could be added

So I buy open/close as a region operation.  split/merge are multi region operations -- is
there enough state to recover from a failure?

So alter table is a region operation? Why isn't it in the state machine? 

bq. Hmm... that would require implementing region locks, and having a very large cluster.
I am talking more about unacceptable blocking of user operations, and management of expiring
locks in presence of real-life failures.

Implementing region locks is going too far -- I'm asking for some back-of-the-napkin discussion.
 I think we need some measurements of how much throughput we can get in ZK or with a ZK-lock
implementation, and then compare this with # of RS watchers * # of regions * number of ops.
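
To make the napkin math concrete, here is a rough sketch that reads the product above literally (every watcher sees every region op). All inputs are made-up placeholders, not measurements, and the class and method names are hypothetical; the real ZK throughput figure would have to come from the measurements asked for above.

```java
// Back-of-the-napkin sketch. All inputs are hypothetical placeholders, not measurements.
final class ZkLoadEstimate {
    /** Watch notifications fanned out if every watcher sees every region op. */
    static long notifications(long rsWatchers, long regions, long opsPerRegion) {
        return Math.multiplyExact(Math.multiplyExact(rsWatchers, regions), opsPerRegion);
    }

    /** Seconds needed to drain that many events at a given (measured) ZK throughput. */
    static double secondsToDrain(long notifications, double zkEventsPerSecond) {
        if (zkEventsPerSecond <= 0) {
            throw new IllegalArgumentException("throughput must be positive");
        }
        return notifications / zkEventsPerSecond;
    }

    public static void main(String[] args) {
        // Assumed for illustration: 100 RSs, 10k regions, 2 ops per region, 20k events/s.
        long n = notifications(100, 10_000, 2);
        System.out.println(n + " notifications, " + secondsToDrain(n, 20_000.0) + " s to drain");
    }
}
```

If the drain time at the measured throughput is far beyond what users would tolerate for a bulk operation, that is the kind of number that would argue for or against a ZK-lock approach.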

The current regions-in-transition (RIT) code basically assumes that a region with no znode is either
closed or opened.  RIT znodes are present when the region is in the in-between states (opening,
closing, 

bq. You mean like WAL for operations?

Yeah, we could call it an "intent" log.  It would have info so that a promoted backup master
can look in one place and complete an operation started by the downed original master.
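
A minimal sketch of the intent-log idea, assuming an in-memory map as a stand-in for whatever durable store would actually hold it. `IntentLog`, its methods, and the operation ids are all hypothetical names for illustration, not existing HBase classes.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory sketch of an "intent" log; a real one would need durable storage.
final class IntentLog {
    enum Status { STARTED, DONE }

    private final Map<String, Status> entries = new LinkedHashMap<>();

    /** Record the intent before doing any work, so a successor can find it. */
    synchronized void begin(String opId) {
        entries.put(opId, Status.STARTED);
    }

    /** Mark the operation finished once all of its steps have been applied. */
    synchronized void complete(String opId) {
        entries.put(opId, Status.DONE);
    }

    /** What a promoted backup master would scan: operations begun but never completed. */
    synchronized List<String> pending() {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Status> e : entries.entrySet()) {
            if (e.getValue() == Status.STARTED) {
                out.add(e.getKey());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        IntentLog log = new IntentLog();
        log.begin("open:region-a");
        log.begin("split:region-b");
        log.complete("open:region-a");
        // The original master "dies" here; the backup reads one place and finishes the rest.
        for (String op : log.pending()) {
            System.out.println("recovering " + op);
            log.complete(op);
        }
    }
}
```

The sketch only shows the bookkeeping. Completing a half-done operation safely also requires each step to be idempotent or resumable, which is not modeled here.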

bq. ... Also usually that would mean RSes won't be able to initiate operations (like split)
- they will have to go thru master (which I would argue is ok).

I know I've suggested something like this before.  Currently the RS initiates a split, and
does the region open/meta changes.  If there are errors, at some point the master side detects
a timeout.  An alternative would have splits initiated on the RS but have the master do
some kind of atomic changes to meta and region state for the 3 involved regions (parent, daughter
a and daughter b).  
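
Roughly, the master-side commit could look like the sketch below: check all preconditions first, then apply the changes for all three regions in one step or not at all. The in-memory map stands in for meta, and the `SplitCommit` class and its states are hypothetical names, not the actual HBase region states.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for meta; names and states are illustrative only.
final class SplitCommit {
    enum State { OPEN, SPLITTING, SPLIT }

    private final Map<String, State> meta = new HashMap<>();

    synchronized void put(String region, State s) {
        meta.put(region, s);
    }

    synchronized State get(String region) {
        return meta.get(region);
    }

    /**
     * All-or-nothing: validate every precondition first, then apply the three
     * changes under one lock so no reader of this object sees a half-finished split.
     */
    synchronized boolean commitSplit(String parent, String daughterA, String daughterB) {
        if (meta.get(parent) != State.SPLITTING) {
            return false; // parent is not mid-split
        }
        if (meta.containsKey(daughterA) || meta.containsKey(daughterB)) {
            return false; // a daughter already exists
        }
        meta.put(parent, State.SPLIT);
        meta.put(daughterA, State.OPEN);
        meta.put(daughterB, State.OPEN);
        return true;
    }

    public static void main(String[] args) {
        SplitCommit m = new SplitCommit();
        m.put("parent", State.SPLITTING);
        System.out.println(m.commitSplit("parent", "daughter-a", "daughter-b"));
        System.out.println(m.commitSplit("parent", "daughter-a", "daughter-b"));
    }
}
```

A lock only gives atomicity within one process. Surviving a master crash between the three writes would still need a transactional store or something like the intent log discussed earlier.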

bq. Depends on where we store it, but yeah these have to be transactional. Last section (very
short ) suggests using ZK, which already supports that.

We need to be careful about ZK -- since it is also a network connection, exceptions could
be real failures or timeouts (where the operation succeeded but wasn't able to ack).  We should describe
the properties (durable vs erasable) and the assumptions (if the wipeable ZK is the source of truth, how do we make
sure the version state is recoverable without time travel?).
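
To illustrate the "succeeded but couldn't ack" case, here is a sketch against a hypothetical in-memory versioned cell. It is a model, not the ZooKeeper client API; `AmbiguousWrite`, `VersionedCell` and `LostAck` are invented names. The idea is that a version-conditioned write lets the caller read back after an ambiguous failure instead of blindly retrying.

```java
// Hypothetical in-memory model of a versioned store whose ack can be lost.
// This is not the ZooKeeper client API.
final class AmbiguousWrite {
    static final class LostAck extends Exception {}

    static final class VersionedCell {
        private String value = "";
        private int version = 0;
        private boolean dropNextAck = false;

        synchronized void dropNextAck() { dropNextAck = true; }
        synchronized String value() { return value; }
        synchronized int version() { return version; }

        /** Compare-and-set on version. May apply the write and still throw LostAck. */
        synchronized boolean set(String newValue, int expectedVersion) throws LostAck {
            if (version != expectedVersion) {
                return false;
            }
            value = newValue;
            version++;
            if (dropNextAck) {
                dropNextAck = false;
                throw new LostAck(); // the write landed, but the caller never hears about it
            }
            return true;
        }
    }

    /** On an ambiguous failure, read back instead of assuming the write failed. */
    static boolean setOnce(VersionedCell cell, String newValue, int expectedVersion) {
        try {
            return cell.set(newValue, expectedVersion);
        } catch (LostAck e) {
            // Did our write land? Check for our value at exactly the next version.
            return cell.version() == expectedVersion + 1 && newValue.equals(cell.value());
        }
    }

    public static void main(String[] args) {
        VersionedCell cell = new VersionedCell();
        cell.dropNextAck();
        System.out.println(setOnce(cell, "CLOSED", 0)); // ack lost, read-back confirms
        System.out.println(setOnce(cell, "OPEN", 0));   // stale version, rejected
    }
}
```

The read-back check is itself only a heuristic: another writer could have stored the same value at the same version. It also says nothing about the durable-vs-erasable question, which still needs to be answered separately.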

                
> Generic framework for Master-coordinated tasks
> ----------------------------------------------
>
>                 Key: HBASE-5487
>                 URL: https://issues.apache.org/jira/browse/HBASE-5487
>             Project: HBase
>          Issue Type: New Feature
>          Components: master, regionserver, Zookeeper
>    Affects Versions: 0.94.0
>            Reporter: Mubarak Seyed
>         Attachments: Region management in Master.pdf
>
>
> Need a framework to execute master-coordinated tasks in a fault-tolerant manner. 
> Master-coordinated tasks such as online schema change and delete-range (deleting region(s)
based on start/end key) can make use of this framework.
> The advantages of the framework are
> 1. Eliminate repeated code in Master, ZooKeeper tracker and Region-server for master-coordinated
tasks
> 2. Ability to abstract the common functions across Master -> ZK and RS -> ZK
> 3. Easy to plug in new master-coordinated tasks without adding code to core components

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
