hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Ming Ma (Resolved) (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (HBASE-4455) Rolling restart RSs scenario, -ROOT-, .META. regions are lost in AssignmentManager
Date Wed, 28 Sep 2011 21:42:46 GMT

     [ https://issues.apache.org/jira/browse/HBASE-4455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Ming Ma resolved HBASE-4455.
----------------------------

    Resolution: Fixed
    
> Rolling restart RSs scenario, -ROOT-, .META. regions are lost in AssignmentManager
> ----------------------------------------------------------------------------------
>
>                 Key: HBASE-4455
>                 URL: https://issues.apache.org/jira/browse/HBASE-4455
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Ming Ma
>            Assignee: Ming Ma
>             Fix For: 0.92.0
>
>
> Keep Master up all the time, do rolling restart of RSs like this - stop RS1, wait for
2 seconds, stop RS2, start RS1, wait for 2 seconds, stop RS3, start RS2, wait for 2 seconds,
etc. After a while, you will find the -ROOT-, .META. regions aren't in "regions in transtion"
from AssignmentManager point of view, but they aren't assigned to any regions. Here are the
issues.
> 1. .-ROOT- or .META. location is stale when MetaServerShutdownHandler is invoked to check
if it contains -ROOT- region. That is due to long delay from ZK notification and async nature
of the system. Here is an example, even though new root region server sea-lab-1,60020,1316380133656
is set at T2, at T3 the shutdown process for sea-lab-1,60020,1316380133656, the root location
still points to old server sea-lab-3,60020,1316380037898.
> T1: 2011-09-18 14:08:52,470 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: master:6
> 0000-0x1327e43175e0000 Retrieved 29 byte(s) of data from znode /hbase/root-regio
> n-server and set watcher; sea-lab-3,60020,1316380037898
> T2: 2011-09-18 14:08:57,173 INFO org.apache.hadoop.hbase.catalog.RootLocationEditor:
Setting ROOT region location in ZooKeeper as sea-lab-1,60020,1316380133656
> T3: 2011-09-18 14:10:26,393 DEBUG org.apache.hadoop.hbase.master.ServerManager: Adde
> d=sea-lab-1,60020,1316380133656 to dead servers, submitted shutdown handler to be executed,
root=false, meta=true, current Root Location: sea-lab-3,60020,1316380037898
> T4: 2011-09-18 14:12:37,314 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: master:6
> 0000-0x1327e43175e0000 Retrieved 29 byte(s) of data from znode /hbase/root-region-server
and set watcher; sea-lab-1,60020,1316380133656
> 2. The MetaServerShutdownHandler worker thread that waits for -ROOT- or .META. availability
could be blocked. If meanwhile, the new server that -ROOT- or .META. is being assigned restarted,
another instance of MetaServerShutdownHandler is queued. Eventually, all MetaServerShutdownHandler
worker threads are filled up. It looks like HBASE-4245.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Mime
View raw message