hadoop-common-dev mailing list archives

From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-1885) Race condition in MiniDFSCluster shutdown
Date Wed, 12 Sep 2007 23:00:35 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12526932 ]

Hadoop QA commented on HADOOP-1885:
-----------------------------------

+1

http://issues.apache.org/jira/secure/attachment/12365668/1885.patch applied and successfully
tested against trunk revision r575016.

Test results:   http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/744/testReport/
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/744/console

> Race condition in MiniDFSCluster shutdown
> -----------------------------------------
>
>                 Key: HADOOP-1885
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1885
>             Project: Hadoop
>          Issue Type: Bug
>          Components: test
>            Reporter: Chris Douglas
>            Assignee: Chris Douglas
>             Fix For: 0.15.0
>
>         Attachments: 1885.patch
>
>
> Hudson has been sporadically failing tests that start multiple datanodes in MiniDFSCluster,
> or that run after such tests, particularly on Solaris and Windows. The following appears to
> be at least partially responsible (much credit to Nigel for helping to discern this).
> A common error:
> {noformat}
> java.io.IOException: Cannot remove data directory: /export/home/hudson/hudson/jobs/Hadoop-Nightly/workspace/trunk/build/test/data/dfs/data
> 	at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:126)
> 	at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:80)
> 	at org.apache.hadoop.dfs.TestFsck.testFsckNonExistent(TestFsck.java:96)
> {noformat}
> MiniDFSCluster starts multiple DataNodes by calling DataNode::createDataNode, which creates
> and starts a DataNode thread, assigns the thread instance to a static member, and returns the
> Runnable. Each call from MiniDFSCluster therefore overwrites the previous instance. Since
> every DataNode::shutdown() calls join() on that same Thread, each join after the first is
> essentially a no-op once the first has returned. MiniDFSCluster::shutdown() can therefore
> return before all DataNodes have released their resources, so the next MiniDFSCluster may
> fail to start.
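
The overwritten static Thread described above can be illustrated with a small, self-contained sketch. This is not the Hadoop code: FakeNode, lastStarted, cleanupGate and StaticThreadRaceSketch are hypothetical names, a latch stands in for slow resource release, and the sketch assumes the last-created node is shut down first so that the first join() returns.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical stand-in for a DataNode; not the real org.apache.hadoop.dfs.DataNode.
class FakeNode implements Runnable {
    // Shared by all instances: every create() call overwrites the previous thread.
    static Thread lastStarted;

    private final CountDownLatch stop = new CountDownLatch(1);
    private final CountDownLatch cleanupGate; // simulates slow resource release
    volatile boolean released = false;

    private FakeNode(CountDownLatch cleanupGate) {
        this.cleanupGate = cleanupGate;
    }

    static FakeNode create(CountDownLatch cleanupGate) {
        FakeNode node = new FakeNode(cleanupGate);
        lastStarted = new Thread(node);
        lastStarted.setDaemon(true);
        lastStarted.start();
        return node;
    }

    @Override
    public void run() {
        try {
            stop.await();        // run until asked to stop
            cleanupGate.await(); // cleanup completes only once the gate opens
            released = true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    void shutdown() {
        stop.countDown();
        try {
            // Joins the most recently created thread, which may not be this node's.
            lastStarted.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}

class StaticThreadRaceSketch {
    // Returns how many nodes still hold "resources" when shutdown has returned.
    static int unreleasedAfterShutdown() {
        CountDownLatch slowGate = new CountDownLatch(1); // stays closed during shutdown
        CountDownLatch openGate = new CountDownLatch(0); // already open
        FakeNode first = FakeNode.create(slowGate);
        FakeNode second = FakeNode.create(openGate);     // overwrites lastStarted

        // Assumption of this sketch: the last-created node is stopped first.
        second.shutdown(); // genuinely waits for second's thread
        first.shutdown();  // joins second's finished thread again and returns at once

        int unreleased = (first.released ? 0 : 1) + (second.released ? 0 : 1);
        slowGate.countDown(); // let the leftover thread finish
        return unreleased;
    }

    public static void main(String[] args) {
        System.out.println("unreleased after shutdown: " + unreleasedAfterShutdown());
    }
}
```

In this sketch one node is still unreleased when both shutdown() calls have returned, which mirrors how a following MiniDFSCluster could find the data directory still in use. Joining a per-instance Thread reference would avoid the problem in the sketch; whether the attached patch takes that approach is not stated in this message.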

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

