hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jonathan Hsieh (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-5487) Generic framework for Master-coordinated tasks
Date Fri, 18 Oct 2013 22:11:47 GMT

    [ https://issues.apache.org/jira/browse/HBASE-5487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13799576#comment-13799576

Jonathan Hsieh commented on HBASE-5487:

Yesterday, I shared with sergey and some of the folks interested this a draft of the design
I've been working on (I'll call it the hbck-master) and a list of questions related to Sergey's
design.  Since sergey's has got master5 in the name of the doc I'll refer to it as "master5".
 He's answered some question in email but we should do technical discussions out here.  We'll
be working together to hash out holes in each others designs and potentially merge designs.


I have a lot of questions.  I'll hit the big questions first. Also would i be possible to
put a version of this up as gdoc so we can point out nits and places that need minor clarification?
  (I have a marked up physical copy version of the doc, would be easier to provide feedback).

Main Concerns:

What is a failure and how do you react to failures? I think the master5 design needs to spend
more effort  to considering failure and recovery cases. I claim there are 4 types of responses
from a networked IO operation -  two states we normally deal with ack successful, ack failed
(nack) and unknown due to timeout  that succeeded (timeout success) and unknown due to timeout
that failed (timeout failed). We have historically missed the last two timeout cases or assumed
timeout means failure nack. It seems that master5 makes the same assumptions. 

I'm very concerned about what we need to do to invalidate information cached RS information
at clients in the case of hang, and that will violate the isolation guarantees that we claim
to provide.  I really want a slice in-depth failure handling case analysis including the master
with cached rs assignments for move and something more complicated such as split or alter.

I really want more invariant specified for the FSM states.  e.g. if a region is in state X,
does it have a row in meta? does have data on the FS? is it open on another region? is it
open on only one region? I think having 8 pages of tables at the back of the master5 doc can
be more concise and precise which will help us get attempt to prove correctness.  

Clarification questions:

1)  State update coordination.  What is a "state updates from the outside"  Do RS's initiate
splitting on their own?  Maybe a picture would help so we can figure out if it is similar
or different from hbck-master's?

2) Single point of truth.  What is this truth? what the user specficied actions?  what the
rs's are reporting?  the last state we were confirmed to be at? hbck-master tries to define
what single point of truth means by defining intended, current, and actual state data with
durability properties on each kind. What do clients look at who modifies what? 

3) Table record: "if regions is out of date, it should be closed and reopened". It is not
clear in master5 how regionservers find out that they are out of date. Moreover, how do clients
talking to those RS's with stale versions know they are going to the correct RS especially
in the face of RS failures due to timeout?

4) region record: transition states.  Shouldn't be defined as part of the region record? (This
is really similar to hbck-masters current state and intended state. )

5) Note on user operations: the forgetting thing is scary to me -- in your move split example,
what happens if an RS reads state that is forgotten?  

6) table state machine. how do we guarantee clients are not writing to against out of date
region versions? (in hang situations, regions could be open on multple places -- the hung
RS and the new RS the region was assigned to and successfully opened on)  

7) region state machine.  Earlier draft hand splitting and merge cases.  Are they elided in
master5 or are not present any more. How would this get extended handle jeffrey's distributed
log replay/fast write recovery feature?  

8) logical interactions:  sounds like master5 allows concurrent operations in specfiic regions
and and specfiic table.  (e.g. it will allow moves and splits and merges on the same region).
 hbck-master (though not fully documented) only allows certain region transitions when the
table is enabled or if the table is disabled.  Are we sure we don't get into race conditions?
 What happens if disable gets issued -- its possible for someone to reopens the region and
for old clients to continue writing to it even though it is closed?

nit. 9) "in cursive" mean in italics. :) 

10) The table operations section have tables which I believe are the actions between FSM states
in the table or region fsms.  Is this correct?  Can the edges be labeled to describe which
steps these transitions correspond to?

Short doc:
nit: Design Constraints, code should: Have AM logic isolated from the persistent storage of
// I think this should be "abstracted" so we can plug in different implementations of persistent
storage of state.

> Generic framework for Master-coordinated tasks
> ----------------------------------------------
>                 Key: HBASE-5487
>                 URL: https://issues.apache.org/jira/browse/HBASE-5487
>             Project: HBase
>          Issue Type: New Feature
>          Components: master, regionserver, Zookeeper
>    Affects Versions: 0.94.0
>            Reporter: Mubarak Seyed
>            Assignee: Sergey Shelukhin
>            Priority: Critical
>         Attachments: Entity management in Master - part 1.pdf, Region management in Master5.docx,
Region management in Master.pdf
> Need a framework to execute master-coordinated tasks in a fault-tolerant manner. 
> Master-coordinated tasks such as online-scheme change and delete-range (deleting region(s)
based on start/end key) can make use of this framework.
> The advantages of framework are
> 1. Eliminate repeated code in Master, ZooKeeper tracker and Region-server for master-coordinated
> 2. Ability to abstract the common functions across Master -> ZK and RS -> ZK
> 3. Easy to plugin new master-coordinated tasks without adding code to core components

This message was sent by Atlassian JIRA

View raw message