hbase-issues mailing list archives

From "Phabricator (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-5344) [89-fb] Scan unassigned region directory on master failover
Date Tue, 14 Feb 2012 21:20:00 GMT

    [ https://issues.apache.org/jira/browse/HBASE-5344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208018#comment-13208018
] 

Phabricator commented on HBASE-5344:
------------------------------------

stack has commented on the revision "[jira] [HBASE-5344] [89-fb] Scan unassigned region directory
on master failover".

  What's the state of this patch, Mikhail?  Are you going to apply it to 0.89fb?  If it goes
into 0.89fb, I'd then like to forward-port it.  It looks like it could take care of some trunk
issues we see.

  Is it possible that querying the regionservers would return state that differs from what
is in .META.? (I suppose if it does, we have bigger issues?)

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/master/DirectRegionServerScanner.java:56 Should this
be read from Configuration?
  src/main/java/org/apache/hadoop/hbase/master/DirectRegionServerScanner.java:68 This does
not do retries (and it looks like further down in the code you are not retrying the Callable).
 In TRUNK we use an HTable instance -- i.e. a Callable w/ retries -- so we get retrying (that's
a big change in trunk -- doing retries rather than one-time HConnection calls)
  src/main/java/org/apache/hadoop/hbase/master/DirectRegionServerScanner.java:51 FYI, in trunk,
hbck needs what this class does over in HBaseFSCK#processRegionServers.  It could use this
class one day.  Currently it asks master for this cluster status (which wouldn't work where
this is needed on master failover)
  src/main/java/org/apache/hadoop/hbase/master/DirectRegionServerScanner.java:96 What is this?
 It seems fb-specific?  If there are no regionservers in zk, then it's a cluster startup, which
means what?  Does it mean the cluster is starting?  What if there was a regionserver up and
running already but it had not yet been assigned any regions?  Wouldn't this be a clean cluster
startup too?
  src/main/java/org/apache/hadoop/hbase/master/DirectRegionServerScanner.java:107 Yeah, this
stuff does not retry, which may be OK on startup here.
  src/main/java/org/apache/hadoop/hbase/master/DirectRegionServerScanner.java:235 Nice utility
  src/main/java/org/apache/hadoop/hbase/master/HMaster.java:160 Misspelled
  src/main/java/org/apache/hadoop/hbase/master/ZKUnassignedWatcher.java:50 We don't have this
class in TRUNK.  Was it added to 0.89fb?
  src/main/java/org/apache/hadoop/hbase/master/ZKUnassignedWatcher.java:88 Why delete it?
 In case it has unassigned znodes?  I suppose this is legit if isClusterStartup means no
regionservers are up on the cluster.
  src/main/java/org/apache/hadoop/hbase/master/ZKUnassignedWatcher.java:128 ZKUtil.joinZNode
does this.
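
For reference, the "Callable w/ retries" pattern mentioned in the comment on line 68 can be sketched roughly as below. This is only an illustration under stated assumptions: `RetryingCaller`, `callWithRetries`, `maxAttempts` and `pauseMs` are hypothetical names, not the trunk HTable code or anything in this patch.

```java
import java.util.concurrent.Callable;

public class RetryingCaller {
    /**
     * Hypothetical helper: runs the task up to maxAttempts times,
     * sleeping pauseMs between failed attempts, and rethrows the last
     * failure if every attempt fails.
     */
    public static <T> T callWithRetries(Callable<T> task, int maxAttempts, long pauseMs)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.call();
            } catch (InterruptedException e) {
                // Do not retry after an interrupt; restore the flag and give up.
                Thread.currentThread().interrupt();
                throw e;
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(pauseMs);
                }
            }
        }
        throw last;
    }
}
```

A one-time call corresponds to `maxAttempts = 1`; wrapping each regionserver query this way would be one possible way to get retry behavior similar in spirit to what trunk does.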

  So we are going through each of the unassigned znodes and we are going to update .META.?
 I see that in the loop, if we trip over .META., then we'll just return.  What's that about?
 Is it that .META. is not assigned?  Are .META. and -ROOT- assigned before this method is called?

REVISION DETAIL
  https://reviews.facebook.net/D1605

                
> [89-fb] Scan unassigned region directory on master failover
> -----------------------------------------------------------
>
>                 Key: HBASE-5344
>                 URL: https://issues.apache.org/jira/browse/HBASE-5344
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Mikhail Bautin
>            Assignee: Mikhail Bautin
>         Attachments: D1605.1.patch
>
>
> In case the master dies after a regionserver writes region state as OPENED or CLOSED
in ZK but before the update is received by master and written to meta, the new master that
comes up has to pick up the region state from ZK and write it to meta. Otherwise we can get
multiply-assigned regions.
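
The recovery step described in the issue can be sketched with in-memory stand-ins as below. This is a simplified illustration, not the patch itself: `UnassignedReconciler`, `Transition` and `reconcile` are hypothetical names, the maps stand in for the ZK unassigned directory and for meta, and treating CLOSED as "drop the location from meta" is an assumption made for the example.

```java
import java.util.HashMap;
import java.util.Map;

public class UnassignedReconciler {
    /** The two region states named in the issue description. */
    public enum ZkState { OPENED, CLOSED }

    /** Hypothetical shape of what an unassigned znode records. */
    public static final class Transition {
        final ZkState state;
        final String server;

        public Transition(ZkState state, String server) {
            this.state = state;
            this.server = server;
        }
    }

    /**
     * Applies every pending transition to a copy of the meta map
     * (region name -> hosting server) and returns the updated copy.
     */
    public static Map<String, String> reconcile(Map<String, Transition> unassigned,
                                                Map<String, String> meta) {
        Map<String, String> updated = new HashMap<>(meta);
        for (Map.Entry<String, Transition> e : unassigned.entrySet()) {
            Transition t = e.getValue();
            if (t.state == ZkState.OPENED) {
                // The regionserver opened the region; record the new location.
                updated.put(e.getKey(), t.server);
            } else {
                // Assumption: a CLOSED region has no location until reassigned.
                updated.remove(e.getKey());
            }
        }
        return updated;
    }
}
```

If the new master skipped this step, the meta stand-in would keep pointing at the old server for a region that has already been opened elsewhere, which is the multiple-assignment hazard the issue describes.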

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
