zookeeper-dev mailing list archives

From "Thawan Kooburat (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ZOOKEEPER-1798) Fix race condition in testNormalObserverRun
Date Fri, 01 Nov 2013 06:51:17 GMT

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13811090#comment-13811090 ]

Thawan Kooburat commented on ZOOKEEPER-1798:

Just for the record, this test is not known to be flaky in our internal Jenkins (which tests
our internal branch).

I am able to repro this on my Mac (Java 1.7.0_15, OS X 10.7.5). When this happens, it looks like
the txnlog doesn't have any valid content in it, so the zkdb that we load after shutting down
the observer never has the txn that sets its znodes to "data2". I also modified the test to leave
the data files around and tried to load them manually after the test failed. The txnlog loaded
successfully with the right content.

I suspect that the data flushed to disk by one thread is not visible to the other thread
even though thread.join() is called in between. However, this seems really unlikely. When I ran
the same test on our production host, I could not repro the issue (yet).
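
For reference, the scenario can be checked in isolation with a minimal sketch like the one
below. It is not taken from Zab1_0Test and does not use ZooKeeper's txnlog classes; the class
name JoinVisibilityCheck, the helper writeThenRead, and the temp file name are hypothetical.
It assumes only the standard guarantee that everything a thread does happens-before another
thread's successful return from join() on it, and that the file is closed before the writer
thread exits.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class JoinVisibilityCheck {

    // Hypothetical helper: write the payload on a separate thread, join that
    // thread, then read the file back on the calling thread.
    static String writeThenRead(final Path file, final String payload) throws Exception {
        final Throwable[] failure = new Throwable[1];
        Thread writer = new Thread(new Runnable() {
            @Override
            public void run() {
                try {
                    // Files.write opens, truncates, writes, and closes the file
                    // before it returns.
                    Files.write(file, payload.getBytes(StandardCharsets.UTF_8));
                } catch (IOException e) {
                    failure[0] = e;
                }
            }
        });
        writer.start();
        // join() returns only after the writer thread has terminated.
        writer.join();
        if (failure[0] != null) {
            throw new IOException("writer thread failed", failure[0]);
        }
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("txnlog-sketch", ".bin");
        try {
            // Repeat the write/join/read cycle to look for a stale read.
            for (int i = 0; i < 100; i++) {
                String expected = "data" + i;
                String actual = writeThenRead(file, expected);
                if (!expected.equals(actual)) {
                    throw new AssertionError("expected " + expected + " but was " + actual);
                }
            }
            System.out.println("no stale reads observed");
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

If a loop like this never observes a stale read on the affected machine, that would point away
from a visibility problem and toward something specific to how the test writes or closes the
txnlog.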

Patrick's log is slightly different. The test failed at line 1105, which means that
the first txn in the txnlog was read correctly, but not the second one.

> Fix race condition in testNormalObserverRun
> -------------------------------------------
>                 Key: ZOOKEEPER-1798
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1798
>             Project: ZooKeeper
>          Issue Type: Bug
>            Reporter: Flavio Junqueira
>            Assignee: Thawan Kooburat
>            Priority: Blocker
>             Fix For: 3.4.6, 3.5.0
>         Attachments: TEST-org.apache.zookeeper.server.quorum.Zab1_0Test.txt, ZOOKEEPER-1798-b3.4.patch,
> ZOOKEEPER-1798-b3.4.patch, ZOOKEEPER-1798-b3.4.patch, ZOOKEEPER-1798.patch, ZOOKEEPER-1798.patch
> These are the output messages:
> <noformat>
> Testcase: testNormalObserverRun took 4.221 sec
>         FAILED
> expected:<data[2]> but was:<data[1]>
> junit.framework.AssertionFailedError: expected:<data[2]> but was:<data[1]>
>         at org.apache.zookeeper.server.quorum.Zab1_0Test$8.converseWithObserver(Zab1_0Test.java:1118)
>         at org.apache.zookeeper.server.quorum.Zab1_0Test.testObserverConversation(Zab1_0Test.java:546)
>         at org.apache.zookeeper.server.quorum.Zab1_0Test.testNormalObserverRun(Zab1_0Test.java:994)
> <noformat>

This message was sent by Atlassian JIRA