hbase-issues mailing list archives

From "Zhihong Yu (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-5099) ZK event thread waiting for root region assignment may block server shutdown handler for the region server the root region was on
Date Sat, 31 Dec 2011 00:33:30 GMT

    [ https://issues.apache.org/jira/browse/HBASE-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13177850#comment-13177850 ]

Zhihong Yu commented on HBASE-5099:
-----------------------------------

Please read through the test output of 0.92 builds 217 and 218.
With patch 5099.92 applied, the test failure is reproducible on a MacBook.

Another way to validate is to deploy patch 5099.92 to real clusters and check whether replication works.
                
> ZK event thread waiting for root region assignment may block server shutdown handler for the region server the root region was on
> --------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-5099
>                 URL: https://issues.apache.org/jira/browse/HBASE-5099
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.0, 0.94.0
>            Reporter: Jimmy Xiang
>            Assignee: Jimmy Xiang
>             Fix For: 0.92.0, 0.94.0
>
>         Attachments: 5099.92, ZK-event-thread-waiting-for-root.png, distributed-log-splitting-hangs.png,
>                      hbase-5099-v2.patch, hbase-5099-v3.patch, hbase-5099-v4.patch, hbase-5099-v5.patch,
>                      hbase-5099-v6.patch, hbase-5099.patch
>
>
> A region server (RS) died.  The ServerShutdownHandler kicked in and started log splitting.
> SplitLogManager installed the tasks asynchronously, then started to wait for them to complete.
> The task znodes were not actually created.  The requests were only queued.
> At this point, the ZooKeeper connection expired.  HMaster tried to recover the expired ZK session.
> During the recovery, a new ZooKeeper connection was created.  However, this master became
> the active master again.  It tried to assign root and meta.
> Because the dead RS had been hosting the old root region, the master needs to wait for the
> log splitting to complete.
> This wait holds the ZooKeeper event thread, so the async create of the split task is never
> retried: there is only one event thread, and it is waiting for the root region to be assigned.
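
The hang described above follows a general pattern: a handler running on the only event thread blocks on a condition that can be satisfied only by a callback queued behind it on that same thread. The sketch below reproduces that pattern with plain java.util.concurrent primitives. It is an illustration under stated assumptions, not HBase or ZooKeeper code: the class, method, and variable names are hypothetical, and the single-thread executor merely stands in for the ZK event thread.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical illustration; none of these names come from HBase or ZooKeeper.
public class SingleEventThreadStall {

    // The handler waits ON the event thread for a callback queued behind it.
    // Returns true only if the callback ran before the timeout expired.
    static boolean blockingHandlerSeesCallback(long timeoutMillis) throws InterruptedException {
        ExecutorService eventThread = Executors.newSingleThreadExecutor();
        CountDownLatch splitTaskCreated = new CountDownLatch(1);
        CountDownLatch handlerDone = new CountDownLatch(1);
        AtomicBoolean sawCallback = new AtomicBoolean(false);
        try {
            eventThread.submit(() -> {
                // Stand-in for the queued async request: its callback needs the event thread.
                eventThread.submit(splitTaskCreated::countDown);
                try {
                    // Stand-in for "wait for root region assignment" on the event thread.
                    sawCallback.set(splitTaskCreated.await(timeoutMillis, TimeUnit.MILLISECONDS));
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    handlerDone.countDown();
                }
            });
            handlerDone.await();
            return sawCallback.get();
        } finally {
            eventThread.shutdownNow();
        }
    }

    // The handler hands the wait to a separate worker thread and returns at once,
    // so the event thread stays free to run the queued callback.
    static boolean offloadedWaitSeesCallback(long timeoutMillis) throws Exception {
        ExecutorService eventThread = Executors.newSingleThreadExecutor();
        ExecutorService worker = Executors.newSingleThreadExecutor();
        CountDownLatch splitTaskCreated = new CountDownLatch(1);
        try {
            Future<Boolean> waiter = eventThread.submit(() -> {
                eventThread.submit(splitTaskCreated::countDown);
                return worker.submit(
                        () -> splitTaskCreated.await(timeoutMillis, TimeUnit.MILLISECONDS));
            }).get();
            return waiter.get();
        } finally {
            eventThread.shutdownNow();
            worker.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("blocking handler saw callback: " + blockingHandlerSeesCallback(200));
        System.out.println("offloaded wait saw callback:   " + offloadedWaitSeesCallback(2000));
    }
}
```

In the first method the wait can end only by timing out, because the callback cannot run until the handler releases the thread. With no timeout, as in the scenario reported here, the handler would wait forever. Moving the wait off the event thread is one possible remedy; whether the attached patches take that approach is not stated in this message.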

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
