zookeeper-dev mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ZOOKEEPER-2080) ReconfigRecoveryTest fails intermittently
Date Mon, 23 Jan 2017 08:20:27 GMT

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-2080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15834049#comment-15834049 ]

ASF GitHub Bot commented on ZOOKEEPER-2080:
-------------------------------------------

Github user shralex commented on a diff in the pull request:

    https://github.com/apache/zookeeper/pull/92#discussion_r97266698
  
    --- Diff: src/java/main/org/apache/zookeeper/server/quorum/QuorumCnxManager.java ---
    @@ -468,31 +469,33 @@ synchronized private boolean connectOne(long sid, InetSocketAddress electionAddr
          */
         
         synchronized void connectOne(long sid){
    +        connectOne(sid, self.getLastSeenQuorumVerifier());
    +    }
    +
    +    synchronized void connectOne(long sid, QuorumVerifier lastSeenQV){
             if (senderWorkerMap.get(sid) != null) {
    -             LOG.debug("There is a connection already for server " + sid);
    -             return;
    +            LOG.debug("There is a connection already for server " + sid);
    +            return;
             }
    -        synchronized(self) {
    -           boolean knownId = false;
    -            // Resolve hostname for the remote server before attempting to
    -            // connect in case the underlying ip address has changed.
    -            self.recreateSocketAddresses(sid);
    -            if (self.getView().containsKey(sid)) {
    -               knownId = true;
    -                if (connectOne(sid, self.getView().get(sid).electionAddr))
    -                   return;
    -            } 
    -            if (self.getLastSeenQuorumVerifier()!=null && self.getLastSeenQuorumVerifier().getAllMembers().containsKey(sid)
    -                   && (!knownId || (self.getLastSeenQuorumVerifier().getAllMembers().get(sid).electionAddr !=
    -                   self.getView().get(sid).electionAddr))) {
    -               knownId = true;
    -                if (connectOne(sid, self.getLastSeenQuorumVerifier().getAllMembers().get(sid).electionAddr))
    -                   return;
    -            } 
    -            if (!knownId) {
    -                LOG.warn("Invalid server id: " + sid);
    +        boolean knownId = false;
    +        // Resolve hostname for the remote server before attempting to
    +        // connect in case the underlying ip address has changed.
    +        self.recreateSocketAddresses(sid);
    +        if (self.getView().containsKey(sid)) {
    --- End diff --
    
    How about also passing in the last committed view, so that you don't need to call getView() multiple times?
    I know this is protected by the lock in QuorumPeer, but until I read the other file I thought there might be a race because the config is accessed several times. A comment would help here.
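
The suggestion above can be sketched roughly as follows. This is an illustration only, not the code from the pull request: the class `ViewSnapshotSketch`, the method `candidateAddrs`, and the plain maps standing in for the committed view and the last seen config are all hypothetical. The idea is that both views are handed to the method once, so every check reads the same snapshot and nothing re-reads shared state partway through.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not ZooKeeper code: plain maps stand in for the
// last committed view and the last seen config.
class ViewSnapshotSketch {

    // Returns the election addresses to try for sid, in order.
    // Both views are passed in once, so every check sees the same snapshot.
    static List<InetSocketAddress> candidateAddrs(
            long sid,
            Map<Long, InetSocketAddress> committedView,
            Map<Long, InetSocketAddress> lastSeenView) {
        List<InetSocketAddress> candidates = new ArrayList<>();

        InetSocketAddress committed = committedView.get(sid);
        if (committed != null) {
            candidates.add(committed);
        }

        InetSocketAddress lastSeen =
                (lastSeenView == null) ? null : lastSeenView.get(sid);
        // Assumption: addresses are compared by value with equals();
        // the removed lines in the diff compared them with !=.
        if (lastSeen != null && !lastSeen.equals(committed)) {
            candidates.add(lastSeen);
        }

        return candidates; // an empty list means the server id is unknown
    }

    public static void main(String[] args) {
        Map<Long, InetSocketAddress> committed = new HashMap<>();
        committed.put(1L, InetSocketAddress.createUnresolved("host-a", 3888));

        Map<Long, InetSocketAddress> lastSeen = new HashMap<>();
        lastSeen.put(1L, InetSocketAddress.createUnresolved("host-b", 3888));
        lastSeen.put(2L, InetSocketAddress.createUnresolved("host-c", 3888));

        System.out.println(candidateAddrs(1L, committed, lastSeen).size()); // 2
        System.out.println(candidateAddrs(2L, committed, lastSeen).size()); // 1
        System.out.println(candidateAddrs(3L, committed, lastSeen).size()); // 0
    }
}
```

A caller would still need to take both snapshots under whatever lock guards the config; the sketch only shows that the lookup itself no longer depends on repeated reads.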
    



> ReconfigRecoveryTest fails intermittently
> -----------------------------------------
>
>                 Key: ZOOKEEPER-2080
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2080
>             Project: ZooKeeper
>          Issue Type: Sub-task
>            Reporter: Ted Yu
>            Assignee: Michael Han
>             Fix For: 3.5.3, 3.6.0
>
>         Attachments: jacoco-ZOOKEEPER-2080.unzip-grows-to-70MB.7z, repro-20150816.log, threaddump.log, ZOOKEEPER-2080.patch, ZOOKEEPER-2080.patch, ZOOKEEPER-2080.patch, ZOOKEEPER-2080.patch, ZOOKEEPER-2080.patch, ZOOKEEPER-2080.patch
>
>
> I got the following test failure on MacBook with trunk code:
> {code}
> Testcase: testCurrentObserverIsParticipantInNewConfig took 93.628 sec
>   FAILED
> waiting for server 2 being up
> junit.framework.AssertionFailedError: waiting for server 2 being up
>   at org.apache.zookeeper.server.quorum.ReconfigRecoveryTest.testCurrentObserverIsParticipantInNewConfig(ReconfigRecoveryTest.java:529)
>   at org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:52)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
