zookeeper-dev mailing list archives

From mkedwards <...@git.apache.org>
Subject [GitHub] zookeeper issue #707: [ZOOKEEPER-2778] QuorumPeer: encapsulate addresses
Date Tue, 20 Nov 2018 17:16:00 GMT
Github user mkedwards commented on the issue:

    The deadlock, as I understand it, is a lock-order inversion.  Thread A, while holding the
lock on the `QuorumCnxManager` object, tries to take `QV_LOCK` in order to access
`myElectionAddress`; meanwhile, thread B, which already holds `QV_LOCK`, calls `connectOne()`
and tries to take the lock on the `QuorumCnxManager` object, producing a deadly embrace.  From
the Jira (thread A's stack first, then thread B's):
        [junit]  java.lang.Thread.State: BLOCKED
        [junit]         at  org.apache.zookeeper.server.quorum.QuorumPeer.getElectionAddress(QuorumPeer.java:686)
        [junit]         at  org.apache.zookeeper.server.quorum.QuorumCnxManager.initiateConnection(QuorumCnxManager.java:265)
        [junit]         at  org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:445)
        [junit]         at  org.apache.zookeeper.server.quorum.QuorumCnxManager.receiveConnection(QuorumCnxManager.java:369)
        [junit]         at  org.apache.zookeeper.server.quorum.QuorumCnxManager$Listener.run(QuorumCnxManager.java:642)
        [junit]  java.lang.Thread.State: BLOCKED
        [junit]         at  org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:472)
        [junit]         at  org.apache.zookeeper.server.quorum.QuorumPeer.connectNewPeers(QuorumPeer.java:1438)
        [junit]         at  org.apache.zookeeper.server.quorum.QuorumPeer.setLastSeenQuorumVerifier(QuorumPeer.java:1471)
        [junit]         at  org.apache.zookeeper.server.quorum.Learner.syncWithLeader(Learner.java:520)
        [junit]         at  org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:88)
        [junit]         at  org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1133)
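
To make the lock ordering concrete, here is a minimal standalone sketch of the same inversion. It is not ZooKeeper code: the class name, the two plain `Object` monitors standing in for `QV_LOCK` and the `QuorumCnxManager` instance, and the `reproduce` helper are all hypothetical. The threads are daemons, so the JVM can still exit once the deadlock has been observed through `ThreadMXBean`.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

// Hypothetical demo class; the names only mirror the roles described in the comment above.
final class LockOrderInversionDemo {

    /** Returns true if the JVM reports a monitor deadlock within timeoutMillis. */
    static boolean reproduce(long timeoutMillis) {
        final Object qvLock = new Object();      // stand-in for QV_LOCK
        final Object cnxManager = new Object();  // stand-in for the QuorumCnxManager monitor
        final CountDownLatch bothHoldFirstLock = new CountDownLatch(2);

        // Thread A: manager monitor first, then QV_LOCK (the getElectionAddress() path).
        Thread a = new Thread(() -> {
            synchronized (cnxManager) {
                arriveAndWait(bothHoldFirstLock);
                synchronized (qvLock) { /* never reached once deadlocked */ }
            }
        }, "thread-A");

        // Thread B: QV_LOCK first, then manager monitor (the connectOne() path).
        Thread b = new Thread(() -> {
            synchronized (qvLock) {
                arriveAndWait(bothHoldFirstLock);
                synchronized (cnxManager) { /* never reached once deadlocked */ }
            }
        }, "thread-B");

        a.setDaemon(true);
        b.setDaemon(true);
        a.start();
        b.start();

        // Poll the JVM's own deadlock detector instead of joining the stuck threads.
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (System.currentTimeMillis() < deadline) {
            long[] ids = mx.findMonitorDeadlockedThreads();
            if (ids != null && ids.length >= 2) {
                return true;
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }

    // Ensures each thread holds its first lock before either attempts the second.
    private static void arriveAndWait(CountDownLatch latch) {
        latch.countDown();
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Calling `LockOrderInversionDemo.reproduce(5000)` should report the cycle almost immediately, because the latch forces both threads to hold their first monitor before reaching for the second. Removing either nested acquisition breaks the cycle.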
    This patch removes the synchronization on `QV_LOCK` from the address accessor paths, where
its current effect seems primarily to ensure cross-thread visibility; the memory barriers
involved in setting/fetching through an `AtomicReference` are equally effective at this. 
Wrapping a single `AtomicReference<AddressTuple>` around all three addresses ensures
that you don't get interleaved sets when two threads each try to update the address fields
-- although in the code as I read it, callers are holding the `QV_LOCK` anyway.  I can't yet
say that I've analyzed all other paths that take `QV_LOCK`, but I'm fairly sure that if https://github.com/apache/zookeeper/pull/247
would work, this would work at least as well, while at least partially addressing concerns
about getting "outdated" state in read paths.  (If this is a serious concern, then presumably
we need a separate reader/writer lock that is taken during creation of the QuorumPeer and
not released until it reaches a "fully configured" state for the first time.)
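
As a rough illustration of the approach described above, the following sketch publishes all three addresses through a single `AtomicReference` to an immutable tuple. It is an assumption-laden sketch rather than the patch itself: the field names `quorumAddr`, `electionAddr` and `clientAddr`, the `PeerAddresses` holder class, and its method names are hypothetical and chosen only for illustration.

```java
import java.net.InetSocketAddress;
import java.util.concurrent.atomic.AtomicReference;

// Immutable snapshot of the three addresses; field names are assumed for illustration.
final class AddressTuple {
    final InetSocketAddress quorumAddr;
    final InetSocketAddress electionAddr;
    final InetSocketAddress clientAddr;

    AddressTuple(InetSocketAddress quorumAddr,
                 InetSocketAddress electionAddr,
                 InetSocketAddress clientAddr) {
        this.quorumAddr = quorumAddr;
        this.electionAddr = electionAddr;
        this.clientAddr = clientAddr;
    }
}

// Hypothetical holder standing in for the address-related part of a peer object.
final class PeerAddresses {
    private final AtomicReference<AddressTuple> addrs = new AtomicReference<>();

    // One set() swaps all three addresses at once, so readers never observe
    // a mix of fields written by two different updaters.
    void setAddrs(InetSocketAddress quorumAddr,
                  InetSocketAddress electionAddr,
                  InetSocketAddress clientAddr) {
        addrs.set(new AddressTuple(quorumAddr, electionAddr, clientAddr));
    }

    // Readers take no lock; AtomicReference.get() has volatile-read semantics,
    // which provides the cross-thread visibility.
    InetSocketAddress getQuorumAddress() {
        AddressTuple t = addrs.get();
        return t == null ? null : t.quorumAddr;
    }

    InetSocketAddress getElectionAddress() {
        AddressTuple t = addrs.get();
        return t == null ? null : t.electionAddr;
    }

    InetSocketAddress getClientAddress() {
        AddressTuple t = addrs.get();
        return t == null ? null : t.clientAddr;
    }
}
```

Because the getters no longer acquire any monitor, a thread that already holds another lock can read an address without joining a lock cycle. The trade-off is that a reader may see the tuple from just before a concurrent update, which is the "outdated" state concern mentioned above.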

