hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-7172) TestSplitLogManager.testVanishingTaskZNode() fails when run individually and is flaky
Date Tue, 11 Dec 2012 23:09:30 GMT

    [ https://issues.apache.org/jira/browse/HBASE-7172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13529437#comment-13529437
] 

Hudson commented on HBASE-7172:
-------------------------------

Integrated in HBase-0.94-security #86 (See [https://builds.apache.org/job/HBase-0.94-security/86/])
    HBASE-7172 TestSplitLogManager.testVanishingTaskZNode() fails when run individually and
is flaky (Revision 1414975)

     Result = SUCCESS
enis : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/SplitLogManager.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/TestSplitLogManager.java

                
> TestSplitLogManager.testVanishingTaskZNode() fails when run individually and is flaky
> -------------------------------------------------------------------------------------
>
>                 Key: HBASE-7172
>                 URL: https://issues.apache.org/jira/browse/HBASE-7172
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.96.0, 0.94.4
>            Reporter: Enis Soztutar
>            Assignee: Enis Soztutar
>             Fix For: 0.96.0, 0.94.4
>
>         Attachments: hbase-7172_v1.patch, hbase-7172_v2-0.94.patch, hbase-7172_v2.patch
>
>
> TestSplitLogManager.testVanishingTaskZNode fails when run individually (run just that
test case from eclipse). I've also noticed that it is flaky on windows. 
> The reason is a rare race condition, which somehow does not happen that much when the
whole class is run.
> The sequence of events is smt like this:
>  - we create 1 log file to split
>  - we call splitLogDistributed() in its own thread. 
>  - splitLogDistributed() is waiting in waitForSplittingCompletion() since there are no
splitlogworkers, it keep waiting.
>  - we delete the task znode from zk
>  - SplitLogManager receives the zk callback from GetDataAsyncCallback, which will call
setDone() and mark the task as success. 
>  - However, meanwhile the waitForSplittingCompletion() loops sees that remainingInZK
== 0, and calls return concurrently to the above. 
>  - on return from waitForSplittingCompletion(), splitLogDistributed() fails because the
znode delete callback has not completed yet. 
> This race only happens when the last task is deleted from zk, and normally only the SplitLogManager
deletes the task znodes after processing it, so I don't think this is a production issue.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message