hbase-issues mailing list archives

From "Enis Soztutar (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-7172) TestSplitLogManager.testVanishingTaskZNode() fails when run individually and is flaky
Date Wed, 28 Nov 2012 21:41:59 GMT

     [ https://issues.apache.org/jira/browse/HBASE-7172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Enis Soztutar updated HBASE-7172:
---------------------------------

    Status: Patch Available  (was: Open)
    
> TestSplitLogManager.testVanishingTaskZNode() fails when run individually and is flaky
> -------------------------------------------------------------------------------------
>
>                 Key: HBASE-7172
>                 URL: https://issues.apache.org/jira/browse/HBASE-7172
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.96.0, 0.94.4
>            Reporter: Enis Soztutar
>            Assignee: Enis Soztutar
>         Attachments: hbase-7172_v1.patch, hbase-7172_v2-0.94.patch, hbase-7172_v2.patch
>
>
> TestSplitLogManager.testVanishingTaskZNode fails when run individually (running just that test case from Eclipse). I've also noticed that it is flaky on Windows.
> The reason is a rare race condition, which somehow does not happen as often when the whole class is run.
> The sequence of events is something like this:
>  - we create 1 log file to split
>  - we call splitLogDistributed() in its own thread.
>  - splitLogDistributed() waits in waitForSplittingCompletion(); since there are no SplitLogWorkers, it keeps waiting.
>  - we delete the task znode from zk
>  - SplitLogManager receives the zk callback from GetDataAsyncCallback, which calls setDone() and marks the task as a success.
>  - however, meanwhile the waitForSplittingCompletion() loop sees that remainingInZK == 0 and returns concurrently with the above.
>  - on return from waitForSplittingCompletion(), splitLogDistributed() fails because the znode delete callback has not completed yet.
> This race only happens when the last task is deleted from zk, and normally only the SplitLogManager deletes the task znodes after processing them, so I don't think this is a production issue.
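The following is a minimal, self-contained Java sketch of the shape of this race, not the actual SplitLogManager code: all class and field names (RaceSketch, remainingInZk, taskMarkedSuccess) are made up for illustration. A waiter thread polls a "tasks remaining in ZK" counter, while the main thread first removes the task and only later runs the "done" callback, so the waiter can return before the callback has completed, mirroring waitForSplittingCompletion() returning before GetDataAsyncCallback/setDone() has run.

import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

public class RaceSketch {
    // number of task znodes still present in ZooKeeper (illustrative)
    static final AtomicInteger remainingInZk = new AtomicInteger(1);
    // set by the asynchronous "task done" callback (illustrative)
    static final AtomicBoolean taskMarkedSuccess = new AtomicBoolean(false);

    public static void main(String[] args) throws InterruptedException {
        // Waiter: stands in for waitForSplittingCompletion(); it returns as soon
        // as it sees no tasks left in ZK, without waiting for the callback.
        Thread waiter = new Thread(() -> {
            while (remainingInZk.get() > 0) {
                try { Thread.sleep(1); } catch (InterruptedException ignored) { }
            }
            // Race window: the callback may not have fired yet.
            if (!taskMarkedSuccess.get()) {
                System.out.println("FAILED: returned before the task was marked done");
            } else {
                System.out.println("OK: task was already marked done");
            }
        });
        waiter.start();

        // Test thread: deletes the (last) task znode...
        remainingInZk.decrementAndGet();
        // ...and only later does the async callback fire and mark success.
        Thread.sleep(50);   // simulated callback latency
        taskMarkedSuccess.set(true);

        waiter.join();
    }
}

Run as-is, the waiter will usually print the FAILED line, which is the same early-return symptom the test hits. Such races are typically closed by making the waiter also account for outstanding callbacks before returning; the attached patches address the race in the actual SplitLogManager/test code.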

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
