hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-6012) AssignmentManager#asyncSetOfflineInZooKeeper wouldn't force node offline
Date Tue, 05 Jun 2012 04:24:23 GMT

    [ https://issues.apache.org/jira/browse/HBASE-6012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13289141#comment-13289141 ]

stack commented on HBASE-6012:
------------------------------

Chunhui, is this method only called when bulk assigning?  Why would there be a znode at all if we
are bulk assigning?  Should we clean up znode state before we bulk assign?  The delete
of the znode ahead of forcing it offline makes me nervous.  If this issue only started showing
up because we added bulk assigning to SSH (ServerShutdownHandler, HBASE-5914), then maybe SSH
should clean up zk before it does the bulk assign, rather than doing this delete and then
offline here.  What do you think?
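
To make the options concrete, here is a rough sketch of the three behaviours under
discussion: a plain create, a forced offline, and a cleanup ahead of the create.  It is a toy
in-memory model only.  The class ForceOfflineSketch, its State enum, and all of its methods
are made up for illustration; none of this is the AssignmentManager, ZKAssign, or ZooKeeper
API, and the stale CLOSING state is an assumption, not something taken from the logs.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy stand-in for the unassigned znodes. Hypothetical; not HBase or ZooKeeper code. */
public class ForceOfflineSketch {
    /** Simplified node states, assumed for this sketch only. */
    public enum State { OFFLINE, CLOSING }

    private final Map<String, State> znodes = new HashMap<>();

    /** Plain create: returns false if a node for the region already exists. */
    public boolean createOffline(String region) {
        return znodes.putIfAbsent(region, State.OFFLINE) == null;
    }

    /** Forced variant: creates the node, or overwrites whatever state is there. */
    public boolean createOrForceOffline(String region) {
        znodes.put(region, State.OFFLINE);
        return true;
    }

    /** Cleanup done by the caller before bulk assign, so a plain create can succeed. */
    public void cleanup(String region) {
        znodes.remove(region);
    }

    /** Test helper: plant a node in a given state. */
    public void put(String region, State state) {
        znodes.put(region, state);
    }

    public State stateOf(String region) {
        return znodes.get(region);
    }

    public static void main(String[] args) {
        ForceOfflineSketch zk = new ForceOfflineSketch();
        String region = "674da422fc7cb9a7d42c74499ace1d93";

        zk.put(region, State.CLOSING);                       // stale node left behind
        System.out.println(zk.createOffline(region));        // false: node already exists
        System.out.println(zk.createOrForceOffline(region)); // true: state overwritten
        System.out.println(zk.stateOf(region));              // OFFLINE

        zk.put(region, State.CLOSING);                       // stale node again
        zk.cleanup(region);                                  // caller cleans up first
        System.out.println(zk.createOffline(region));        // true: plain create now works
    }
}
```

In this model, forcing offline and cleaning up first both end with the node OFFLINE.  The
difference is where the handling of a stale node lives: inside the offline call itself, or
in the caller before bulk assign starts.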
                
> AssignmentManager#asyncSetOfflineInZooKeeper wouldn't force node offline
> ------------------------------------------------------------------------
>
>                 Key: HBASE-6012
>                 URL: https://issues.apache.org/jira/browse/HBASE-6012
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.96.0
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: HBASE-6012.patch, HBASE-6012v2.patch
>
>
> Going by the javadoc of the method and its log message:
> {code}
> /**
>    * Set region as OFFLINED up in zookeeper asynchronously.
>    */
> boolean asyncSetOfflineInZooKeeper(
> ...
> master.abort("Unexpected ZK exception creating/setting node OFFLINE", e);
> ...
> }
> {code}
> I think AssignmentManager#asyncSetOfflineInZooKeeper should also force the node offline,
> just like AssignmentManager#setOfflineInZooKeeper does. Otherwise, a bulk assign that
> calls this method may fail.
> Error log on the master caused by the issue
> 2012-05-12 01:40:09,437 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; was=writetest,1YTQDPGLXBTICHOPQ6IL,1336590857771.674da422fc7cb9a7d42c74499ace1d93. state=PENDING_CLOSE, ts=1336757876856
> 2012-05-12 01:40:09,437 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:60000-0x23736bf74780082 Async create of unassigned node for 674da422fc7cb9a7d42c74499ace1d93 with OFFLINE state
> 2012-05-12 01:40:09,446 WARN org.apache.hadoop.hbase.master.AssignmentManager$CreateUnassignedAsyncCallback: rc != 0 for /hbase-func1/unassigned/674da422fc7cb9a7d42c74499ace1d93 -- retryable connectionloss -- FIX see http://wiki.apache.org/hadoop/ZooKeeper/FAQ#A2
> 2012-05-12 01:40:09,447 FATAL org.apache.hadoop.hbase.master.HMaster: Connectionloss writing unassigned at /hbase-func1/unassigned/674da422fc7cb9a7d42c74499ace1d93, rc=-110

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
