hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "HBase Review Board (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-3112) Enable and disable of table needs a bit of loving in new master
Date Mon, 08 Nov 2010 20:15:09 GMT

    [ https://issues.apache.org/jira/browse/HBASE-3112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12929709#action_12929709

HBase Review Board commented on HBASE-3112:

Message from: "Jonathan Gray" <jgray@apache.org>

This is an automatically generated e-mail. To reply, visit:

Looks pretty good.  Should be much improved.

What about client-side semantics and checks?  I thought there was going to be some change
there?  An async trigger and then a fast way to see if it's done or not?  Should be clear
on javadoc since enable/disable is always thing we get questions on.

I think most of my issues with patch is that it adds huge amount of new code into already
big AssignmentManager class.  Should move these classes out and not sure if we should have
these methods which just touch ZK.  I think having all logic about the state transitions,
usage of ZK, etc more explicitly controlled in the handlers themselves might be more clear
to follow how this works.  Maybe not but was a little hard to follow (that, for example, we
don't have enabling/disabling states in memory... where we were talking about in-memory state
vs zk states, etc).

Also, ZKTable can be simplified (and made more future proof) with an enum.

Now, for failover, I know we said we would punt to 0.92 on handling of this case, but we should
at least make it so we don't get into broken state if we have a master failover.  What will
happen if we are in disabling up in zk but not all closes have been done?

Lastly... I think we definitely need some unit tests on this stuff.  TestMasterFailover does
a little but not really relevant to these changes (only deals with regions already RIT). 
TestRollingRestart does an enable/disable of table.  But none of this stuff takes into account
failure of stuff in handlers, failure of RS, etc... Don't need to hold up this patch on unit
tests if it's working in cluster testing on large tables, but it's kind of thing that would
be good to test at some point.


    I thought we had this method somewhere already?


    nevermind, you just moved it :)


    Whitespace, and should make note that this is checking ZK not in-memory state?


    Are these methods necessary?  Seems like unnecessary stuff in AssignmentManager.  The
Handlers can just use the ZK methods directly?  (keep the AM methods as check/set of it's
internal state?)


    This is for repeated runs of enable?  Should we log if this actually removes regions (table
has X total regions, already Y online, assigning Z still offline)?
    It's okay that this operation is not done under any locks?  Couldn't regions be coming
online/offline concurrent with this operation?






    yeah, instead of all these permutations on isEnabledOrDisabling and what not, should just
use an enum and pass that in as argument to method which checks node state.


    What is this method's policy on watches?  Please note it in javadoc.  This method is not
at all strict and is multiple operations so is not safe to use on nodes where we rely on watches
/ monitoring of state transitions, so let's make some kind of note.

- Jonathan

> Enable and disable of table needs a bit of loving in new master
> ---------------------------------------------------------------
>                 Key: HBASE-3112
>                 URL: https://issues.apache.org/jira/browse/HBASE-3112
>             Project: HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: stack
>            Priority: Critical
>             Fix For: 0.90.0
>         Attachments: 3112-v2.txt, 3112-v3.txt, 3112.txt
> The tools are in place to do a more reliable enable/disable of tables.  Some work has
been done to hack in a basic enable/disable but its not enough -- see the test avro/thrift
tests where a disable/enable/disable switchback can confuse the table state (and has been
disabled until this issue addressed).
> This issue is about finishing off enable/disable in the new master.   I think we need
to add to the table znode an enabling/disabling state rather than have them binary with a
watcher that will stop an enable (or disable) starting until the previous completes (Currently
we atomically switch the state though the region close/open lags -- some work in enable/disable
handlers helps in that they won't complete till all regions have transitioned.. but its not
> Need to add tests too.
> Marking issue critical bug because loads of the questions we get on lists are about enable/disable

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.

View raw message