hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "ramkrishna.s.vasudevan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-6012) AssignmentManager#asyncSetOfflineInZooKeeper wouldn't force node offline
Date Wed, 16 May 2012 05:16:15 GMT

    [ https://issues.apache.org/jira/browse/HBASE-6012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13276470#comment-13276470
] 

ramkrishna.s.vasudevan commented on HBASE-6012:
-----------------------------------------------

@Chunhuui
We observed similar problems when we tried to work on HBASE-5917.  Hence in HBASE-5917 we
have tried to handle one such issue.  We are trying to see if any such problems can happen.
Because in normal assign flow we have many conditions added for various scenarios but the
same doesn't apply in bulk assign.  Will keep you updated on that.
                
> AssignmentManager#asyncSetOfflineInZooKeeper wouldn't force node offline
> ------------------------------------------------------------------------
>
>                 Key: HBASE-6012
>                 URL: https://issues.apache.org/jira/browse/HBASE-6012
>             Project: HBase
>          Issue Type: Bug
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: HBASE-6012.patch
>
>
> As the javadoc of method and the log message
> {code}
> /**
>    * Set region as OFFLINED up in zookeeper asynchronously.
>    */
> boolean asyncSetOfflineInZooKeeper(
> ...
> master.abort("Unexpected ZK exception creating/setting node OFFLINE", e);
> ...
> }
> {code}
> I think AssignmentManager#asyncSetOfflineInZooKeeper should also force node offline,
just like AssignmentManager#setOfflineInZooKeeper do. Otherwise, it may cause bulk assign
failed which called this method.
> Error log on the master caused by the issue
> 2012-05-12 01:40:09,437 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Forcing
OFFLINE; was=writetest,1YTQDPGLXBTICHOPQ6IL,1336590857771.674da422fc7cb9a7d42c74499ace1d93.
state=PENDING_CLOSE, ts=1336757876856 
> 2012-05-12 01:40:09,437 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:60000-0x23736bf74780082
Async create of unassigned node for 674da422fc7cb9a7d42c74499ace1d93 with OFFLINE state 
> 2012-05-12 01:40:09,446 WARN org.apache.hadoop.hbase.master.AssignmentManager$CreateUnassignedAsyncCallback:
rc != 0 for /hbase-func1/unassigned/674da422fc7cb9a7d42c74499ace1d93 -- retryable connectionloss
-- FIX see http://wiki.apache.org/hadoop/ZooKeeper/FAQ#A2 
> 2012-05-12 01:40:09,447 FATAL org.apache.hadoop.hbase.master.HMaster: Connectionloss
writing unassigned at /hbase-func1/unassigned/674da422fc7cb9a7d42c74499ace1d93, rc=-110 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Mime
View raw message