hbase-issues mailing list archives

From "Matteo Bertozzi (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-7440) ReplicationZookeeper#addPeer is racy
Date Thu, 27 Dec 2012 11:14:13 GMT

    [ https://issues.apache.org/jira/browse/HBASE-7440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13539912#comment-13539912 ]

Matteo Bertozzi commented on HBASE-7440:
----------------------------------------

If you use createAndWatch(), the node is not watched if it already exists.
If you instead use createNodeIfNotExistsAndWatch(), as suggested by Lars, you get the watcher
either on the newly created node or on the existing one.
I think you want the second behaviour, right?

What do you expect if the node already exists and is set to DISABLED?
I think it is fine in that state, right? So you don't need to check it or force it to ENABLED.
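
To make the difference concrete, here is a minimal sketch of the two behaviours. It does not use the real ZKUtil or ZooKeeper classes: InMemoryZk is a hypothetical in-memory stand-in, its two methods only model the semantics described above, and IllegalStateException stands in for NodeExistsException.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class PeerStateSketch {

    /** Hypothetical in-memory stand-in for ZooKeeper; not the HBase ZKUtil API. */
    static class InMemoryZk {
        final Map<String, String> nodes = new HashMap<>();
        final Set<String> watched = new HashSet<>();

        /** Models the first behaviour: fails on an existing node and leaves no watch behind. */
        void createAndWatch(String path, String data) {
            if (nodes.containsKey(path)) {
                // Stand-in for KeeperException.NodeExistsException.
                throw new IllegalStateException("NodeExists for " + path);
            }
            nodes.put(path, data);
            watched.add(path);
        }

        /** Models the second behaviour: the watch is set either way; returns true only if created. */
        boolean createNodeIfNotExistsAndWatch(String path, String data) {
            boolean created = nodes.putIfAbsent(path, data) == null;
            watched.add(path);
            return created;
        }
    }

    public static void main(String[] args) {
        String path = "/2/replication/peers/1/peer-state";
        InMemoryZk zk = new InMemoryZk();
        zk.nodes.put(path, "DISABLED"); // someone else created the znode first

        try {
            zk.createAndWatch(path, "ENABLED");
        } catch (IllegalStateException e) {
            System.out.println("createAndWatch failed, watched=" + zk.watched.contains(path));
        }

        boolean created = zk.createNodeIfNotExistsAndWatch(path, "ENABLED");
        System.out.println("created=" + created
            + ", watched=" + zk.watched.contains(path)
            + ", state=" + zk.nodes.get(path));
    }
}
```

In this model the second call leaves the existing DISABLED value untouched and still ends up with a watch on the node, which is the outcome I am arguing for above.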
                
> ReplicationZookeeper#addPeer is racy
> ------------------------------------
>
>                 Key: HBASE-7440
>                 URL: https://issues.apache.org/jira/browse/HBASE-7440
>             Project: HBase
>          Issue Type: Bug
>          Components: Replication
>    Affects Versions: 0.94.3
>            Reporter: Himanshu Vashishtha
>            Assignee: Himanshu Vashishtha
>             Fix For: 0.96.0, 0.94.4
>
>         Attachments: HBASE-7440-v0.patch, HBASE-7440-v1.patch
>
>
> While adding a peer, ReplicationZK creates the znodes in three separate transactions:
> a) the peers znode,
> b) the peerId-specific znode, and
> c) the peerState znode.
> There is a PeersWatcher which invokes getPeer() (after steps b) and c)). If, while a peer is being added, control reaches getPeer() before step c) has been processed, the peer may end up not being added. This happens while running TestMasterReplication#testCyclicReplication().
> {code}
> 2012-12-26 07:36:35,187 INFO  [RegionServer:0;p0120.XXXXX,38423,1356536179470-EventThread] zookeeper.RecoverableZooKeeper(447): Node /2/replication/peers/1/peer-state already exists and this is not a retry
> 2012-12-26 07:36:35,188 ERROR [RegionServer:0;p0120.XXXXX,38423,1356536179470-EventThread] regionserver.ReplicationSourceManager$PeersWatcher(527): Error while adding a new peer
> org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = NodeExists for /2/replication/peers/1/peer-state
> 	at org.apache.zookeeper.KeeperException.create(KeeperException.java:119)
> 	at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> 	at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
> 	at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.createNonSequential(RecoverableZooKeeper.java:428)
> 	at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.create(RecoverableZooKeeper.java:410)
> 	at org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndWatch(ZKUtil.java:1044)
> 	at org.apache.hadoop.hbase.replication.ReplicationPeer.startStateTracker(ReplicationPeer.java:82)
> 	at org.apache.hadoop.hbase.replication.ReplicationZookeeper.getPeer(ReplicationZookeeper.java:344)
> 	at org.apache.hadoop.hbase.replication.ReplicationZookeeper.connectToPeer(ReplicationZookeeper.java:307)
> 	at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager$PeersWatcher.nodeChildrenChanged(ReplicationSourceManager.java:519)
> 	at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:315)
> 	at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
> 	at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
> 2012-12-26 07:36:35,188 DEBUG [RegionServer:0;p0120.XXXXX,55742,1356536171947-EventThread] zookeeper.ZKUtil(1545): regionserver:55742-0x13bd7db39580004 Retrieved 36 byte(s) of data from znode /1/hbaseid; data=9ce66123-d3e8-4ae9-a249-afe03...
> {code}

