zookeeper-dev mailing list archives

From hanm <...@git.apache.org>
Subject [GitHub] zookeeper pull request #92: ZOOKEEPER-2080: Fix deadlock in dynamic reconfig...
Date Tue, 24 Jan 2017 00:39:58 GMT
Github user hanm commented on a diff in the pull request:

    https://github.com/apache/zookeeper/pull/92#discussion_r97451281
  
    --- Diff: src/java/main/org/apache/zookeeper/server/quorum/QuorumCnxManager.java ---
    @@ -468,31 +469,33 @@ synchronized private boolean connectOne(long sid, InetSocketAddress electionAddr
          */
         
         synchronized void connectOne(long sid){
    +        connectOne(sid, self.getLastSeenQuorumVerifier());
    +    }
    +
    +    synchronized void connectOne(long sid, QuorumVerifier lastSeenQV){
             if (senderWorkerMap.get(sid) != null) {
    -             LOG.debug("There is a connection already for server " + sid);
    -             return;
    +            LOG.debug("There is a connection already for server " + sid);
    +            return;
             }
    -        synchronized(self) {
    -           boolean knownId = false;
    -            // Resolve hostname for the remote server before attempting to
    -            // connect in case the underlying ip address has changed.
    -            self.recreateSocketAddresses(sid);
    -            if (self.getView().containsKey(sid)) {
    -               knownId = true;
    -                if (connectOne(sid, self.getView().get(sid).electionAddr))
    -                   return;
    -            } 
    -            if (self.getLastSeenQuorumVerifier()!=null && self.getLastSeenQuorumVerifier().getAllMembers().containsKey(sid)
    -                   && (!knownId || (self.getLastSeenQuorumVerifier().getAllMembers().get(sid).electionAddr !=
    -                   self.getView().get(sid).electionAddr))) {
    -               knownId = true;
    -                if (connectOne(sid, self.getLastSeenQuorumVerifier().getAllMembers().get(sid).electionAddr))
    -                   return;
    -            } 
    -            if (!knownId) {
    -                LOG.warn("Invalid server id: " + sid);
    +        boolean knownId = false;
    +        // Resolve hostname for the remote server before attempting to
    +        // connect in case the underlying ip address has changed.
    +        self.recreateSocketAddresses(sid);
    +        if (self.getView().containsKey(sid)) {
    --- End diff --
    
    @shralex Thanks for the review comments! I made two changes:
    
    * Refactored the code to reuse the getView results. The view is not passed in as a
      parameter, as I think that keeps the caller site simpler.
    * The code block inside connectOne is now synchronized on the same lock that protects
      the other view / quorum verifier accesses of the same QuorumPeer. I think this makes
      the block semantically equivalent to the code before this change, which synchronized
      on the whole QuorumPeer 'self' with the intention that accesses to configs are
      protected during the entire execution of connectOne. I did not add any comments
      because, with the explicit synchronized block, the semantics should be self-explanatory.
    
    My stress tests look good so far with the latest changes.
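
The two changes described in the comment can be illustrated with a minimal, self-contained sketch. It is not the code from the pull request: PeerSketch, CnxManagerSketch, configLock, tryConnect and the reachable predicate are hypothetical names introduced here, and the sketch compares addresses with equals() where the diff above uses a reference comparison. It assumes the peer exposes a dedicated lock object for its config state, reads the committed view and the last seen members once each, and holds only that lock (not the peer's own monitor) while deciding which address to try.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Small demo driver; all class names in this file are hypothetical.
class ConnectOneSketch {
    public static void main(String[] args) {
        InetSocketAddress oldAddr = InetSocketAddress.createUnresolved("old.example", 3888);
        InetSocketAddress newAddr = InetSocketAddress.createUnresolved("new.example", 3888);
        PeerSketch peer = new PeerSketch();
        peer.setView(Collections.singletonMap(1L, oldAddr));
        peer.setLastSeenMembers(Collections.singletonMap(1L, newAddr));
        // Pretend only the address from the last seen config is reachable.
        CnxManagerSketch mgr = new CnxManagerSketch(peer, addr -> addr.equals(newAddr));
        System.out.println("connected: " + mgr.connectOne(1L));
        System.out.println("attempts: " + mgr.attempts);
    }
}

// Stand-in for QuorumPeer: config state is guarded by a dedicated lock
// object rather than by the peer's own monitor (assumed design).
class PeerSketch {
    final Object configLock = new Object(); // assumed name, not ZooKeeper's
    private Map<Long, InetSocketAddress> view = new HashMap<>();
    private Map<Long, InetSocketAddress> lastSeenMembers; // null until a reconfig is seen

    Map<Long, InetSocketAddress> getView() {
        synchronized (configLock) { return new HashMap<>(view); }
    }

    Map<Long, InetSocketAddress> getLastSeenMembers() {
        synchronized (configLock) {
            return lastSeenMembers == null ? null : new HashMap<>(lastSeenMembers);
        }
    }

    void setView(Map<Long, InetSocketAddress> v) {
        synchronized (configLock) { view = new HashMap<>(v); }
    }

    void setLastSeenMembers(Map<Long, InetSocketAddress> m) {
        synchronized (configLock) { lastSeenMembers = new HashMap<>(m); }
    }
}

// Stand-in for the connection manager.
class CnxManagerSketch {
    private final PeerSketch self;
    private final Predicate<InetSocketAddress> reachable; // fake network for the sketch
    final List<InetSocketAddress> attempts = new ArrayList<>();

    CnxManagerSketch(PeerSketch self, Predicate<InetSocketAddress> reachable) {
        this.self = self;
        this.reachable = reachable;
    }

    // Hypothetical helper: records the attempt and reports whether it "succeeded".
    private boolean tryConnect(InetSocketAddress addr) {
        attempts.add(addr);
        return reachable.test(addr);
    }

    synchronized boolean connectOne(long sid) {
        // Hold the config lock for the whole decision so both reads are consistent.
        synchronized (self.configLock) {
            Map<Long, InetSocketAddress> view = self.getView();             // read once, reuse
            Map<Long, InetSocketAddress> lastSeen = self.getLastSeenMembers();
            boolean knownId = false;
            InetSocketAddress committed = view.get(sid);
            if (committed != null) {
                knownId = true;
                if (tryConnect(committed)) return true;
            }
            InetSocketAddress pending = (lastSeen == null) ? null : lastSeen.get(sid);
            // equals() is a choice of this sketch; the diff compares references.
            if (pending != null && (!knownId || !pending.equals(committed))) {
                knownId = true;
                if (tryConnect(pending)) return true;
            }
            if (!knownId) System.err.println("Invalid server id: " + sid);
            return false;
        }
    }
}
```

Because Java monitors are reentrant, the getters can take configLock again while connectOne already holds it. Whether a narrower lock like this avoids a deadlock in practice depends on every other code path acquiring the manager's monitor and the config lock in the same order, which this sketch does not show.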


