hbase-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-3637) Region stuck in OPENED state
Date Mon, 14 Mar 2011 20:15:29 GMT

    [ https://issues.apache.org/jira/browse/HBASE-3637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006614#comment-13006614 ]

Todd Lipcon commented on HBASE-3637:
------------------------------------

2011-03-11 06:42:58,301 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: master:60000-0x22ea55e0f670002
Retrieved 65 byte(s) of data from znode /hbase/unassigned/1028785192 and set watcher; region=.META.,,1,
server=trek08.sf.cloudera.com,60020,1299853933073, state=RS_ZK_REGION_OPENED
2011-03-11 06:42:58,301 INFO org.apache.hadoop.hbase.master.AssignmentManager: Processing
region .META.,,1.1028785192 in state RS_ZK_REGION_OPENED
2011-03-11 06:42:58,302 WARN org.apache.hadoop.hbase.master.AssignmentManager: Region in transition
1028785192 references a server no longer up trek08.sf.cloudera.com,60020,1299853933073; letting
RIT timeout so will be assigned elsewhere
2011-03-11 06:42:58,304 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: master:60000-0x22ea55e0f670002
Received ZooKeeper Event, type=NodeDataChanged, state=SyncConnected, path=/hbase/unassigned/70236052
2011-03-11 06:42:58,305 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: master:60000-0x22ea55e0f670002
Retrieved 65 byte(s) of data from znode /hbase/unassigned/70236052 and set watcher; region=-ROOT-,,0,
server=trek10.sf.cloudera.com,60020,1299854562169, state=RS_ZK_REGION_OPENED
2011-03-11 06:42:58,305 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_OPENED,
server=trek10.sf.cloudera.com,60020,1299854562169, region=70236052/-ROOT-
2011-03-11 06:42:58,307 DEBUG org.apache.hadoop.hbase.master.handler.OpenedRegionHandler:
Handling OPENED event for 70236052; deleting unassigned node
2011-03-11 06:42:58,308 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:60000-0x22ea55e0f670002
Deleting existing unassigned node for 70236052 that is in expected state RS_ZK_REGION_OPENED
2011-03-11 06:42:58,313 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: master:60000-0x22ea55e0f670002
Retrieved 65 byte(s) of data from znode /hbase/unassigned/70236052; data=region=-ROOT-,,0,
server=trek10.sf.cloudera.com,60020,1299854562169, state=RS_ZK_REGION_OPENED
2011-03-11 06:42:58,315 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: master:60000-0x22ea55e0f670002
Received ZooKeeper Event, type=NodeDeleted, state=SyncConnected, path=/hbase/unassigned/70236052
2011-03-11 06:42:58,315 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:60000-0x22ea55e0f670002
Successfully deleted unassigned node for region 70236052 in expected state RS_ZK_REGION_OPENED
2011-03-11 06:42:58,316 DEBUG org.apache.hadoop.hbase.master.handler.OpenedRegionHandler:
Opened region -ROOT-,,0.70236052 on trek10.sf.cloudera.com,60020,1299854562169
2011-03-11 06:42:59,097 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in
transition timed out:  .META.,,1.1028785192 state=OPENING, ts=1299854016886
2011-03-11 06:42:59,097 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has
been OPENING for too long, reassigning region=.META.,,1.1028785192
2011-03-11 06:42:59,098 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: master:60000-0x22ea55e0f670002
Retrieved 65 byte(s) of data from znode /hbase/unassigned/1028785192; data=region=.META.,,1,
server=trek08.sf.cloudera.com,60020,1299853933073, state=RS_ZK_REGION_OPENED
2011-03-11 06:42:59,099 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Region has
transitioned to OPENED, allowing watched event handlers to process


> Region stuck in OPENED state
> ----------------------------
>
>                 Key: HBASE-3637
>                 URL: https://issues.apache.org/jira/browse/HBASE-3637
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.92.0
>            Reporter: Todd Lipcon
>            Priority: Critical
>             Fix For: 0.92.0
>
>
> I don't 100% understand how this happened, but the following was observed:
> - META is in OPENED state in ZK, for a server which no longer exists
> - Handler sees that server is dead, and figures that the RIT timeout will handle it
> - RIT timeout sees that it's already in OPENED state, and assumes that the OPENED handler will handle it
> - loops in timeout state forever, never actually getting reassigned
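
The mutual deferral described above can be modeled in a few lines. This is only a
sketch of the observed behavior, not the actual AssignmentManager code: the class,
enum, and method names below are hypothetical, and onRitTimeoutChecked is just one
possible way to break the cycle, not necessarily the fix that gets committed.

```java
import java.util.Set;

// Hypothetical model of the two code paths from the issue description.
// None of these names come from HBase itself.
public class StuckRegionSketch {
    enum ZkState { OPENING, OPENED }
    enum Action { DELETE_ZNODE, REASSIGN, DEFER_TO_TIMEOUT, DEFER_TO_OPENED_HANDLER }

    // OPENED handling: if the server named in the znode is gone,
    // leave the region for the RIT timeout to deal with.
    static Action onOpenedEvent(String server, Set<String> liveServers) {
        if (!liveServers.contains(server)) {
            return Action.DEFER_TO_TIMEOUT;
        }
        return Action.DELETE_ZNODE;
    }

    // RIT timeout as observed: if ZK already says OPENED,
    // assume the OPENED handler will finish the job.
    static Action onRitTimeout(ZkState zkState) {
        if (zkState == ZkState.OPENED) {
            return Action.DEFER_TO_OPENED_HANDLER;
        }
        return Action.REASSIGN;
    }

    // Assumed variant: only defer when the server is still alive,
    // otherwise reassign even though the znode says OPENED.
    static Action onRitTimeoutChecked(ZkState zkState, String server, Set<String> liveServers) {
        if (zkState == ZkState.OPENED && liveServers.contains(server)) {
            return Action.DEFER_TO_OPENED_HANDLER;
        }
        return Action.REASSIGN;
    }

    public static void main(String[] args) {
        // Server names taken from the log excerpt; trek08 is the dead one.
        String dead = "trek08.sf.cloudera.com,60020,1299853933073";
        Set<String> live = Set.of("trek10.sf.cloudera.com,60020,1299854562169");

        // Each path hands off to the other, so nothing ever reassigns the region.
        System.out.println("OPENED handler:  " + onOpenedEvent(dead, live));
        System.out.println("RIT timeout:     " + onRitTimeout(ZkState.OPENED));
        // With the liveness check, the timeout path reassigns.
        System.out.println("Checked timeout: " + onRitTimeoutChecked(ZkState.OPENED, dead, live));
    }
}
```

In this model the first two calls each return a "defer" action pointing at the other
path, which matches the loop seen in the log: the timeout fires, sees OPENED, and
backs off again.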

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
