hadoop-common-dev mailing list archives

From "dhruba borthakur (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-1396) FileNotFound exception on DFS block
Date Thu, 31 May 2007 08:34:15 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12500317
] 

dhruba borthakur commented on HADOOP-1396:
------------------------------------------

The DFSClient uses a random number generator to generate the name of the temporary file in which
the latest block of the file being written is cached. The problem above could theoretically
occur if two instances of DFSClient get the same value from the random number generator at
around the same time.
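
To make the suspected failure mode concrete, here is a minimal sketch of how a shared random value could surface as a FileNotFoundException. It is not the DFSClient code. The class name, the helper tmpFileFor, the "client-" + value naming (modeled on the path in the stack trace), and the assumption that a writer deletes its tmp file once its block has been sent are all hypothetical.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.file.Files;

public class TmpNameCollisionSketch {
    // Hypothetical naming scheme: tmp file name derived only from a random value.
    static File tmpFileFor(File dir, long randomValue) {
        return new File(dir, "client-" + randomValue);
    }

    // Simulates two writers that drew the same random value.
    // Returns true if the second writer hits FileNotFoundException.
    static boolean secondWriterFails(long sharedValue) throws IOException {
        File dir = Files.createTempDirectory("dfs-tmp-sketch").toFile();
        File a = tmpFileFor(dir, sharedValue);
        File b = tmpFileFor(dir, sharedValue); // same path as a

        // Both writers cache block data in what they believe is a private file.
        try (FileOutputStream out = new FileOutputStream(a)) { out.write(1); }
        try (FileOutputStream out = new FileOutputStream(b)) { out.write(2); }

        // Assumption: the first writer finishes its block and removes its tmp file.
        a.delete();

        // The second writer now tries to read back its cached block.
        try (FileInputStream in = new FileInputStream(b)) {
            return false;
        } catch (FileNotFoundException e) {
            return true;
        } finally {
            dir.delete();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(secondWriterFails(42L));
    }
}
```

In this sketch the second open fails because the path it refers to was removed by the other writer, which would match the "No such file or directory" message in the report if the real cause is indeed a name collision.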

I suspect that enabling speculative execution somehow results in more concurrent tasks
on the same node, which increases the probability of the same tmp file being used concurrently
by multiple tasks. Hence we see this problem more often when speculative execution is switched
on.

An alternative is to use File.createTempFile. This method fails if the file already exists;
otherwise it creates the file atomically.
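
A minimal sketch of that alternative, assuming the tmp files keep a "client-" prefix and live in a single directory. The class and method names are hypothetical and this is not the eventual patch. It relies only on java.io.File.createTempFile(prefix, suffix, directory), which chooses the name and creates the empty file in one call.

```java
import java.io.File;
import java.io.IOException;

public class TempBlockFileSketch {
    // Hypothetical helper: returns a newly created, uniquely named file in dir.
    static File newBlockFile(File dir) throws IOException {
        // The JDK picks the name and creates the file, so two callers
        // are not handed the same path.
        File f = File.createTempFile("client-", ".tmp", dir);
        f.deleteOnExit(); // best-effort cleanup for this sketch
        return f;
    }

    public static void main(String[] args) throws IOException {
        File dir = new File(System.getProperty("java.io.tmpdir"));
        File a = newBlockFile(dir);
        File b = newBlockFile(dir);
        // Both files exist and have distinct paths.
        System.out.println(a.exists() && b.exists() && !a.equals(b));
        a.delete();
        b.delete();
    }
}
```

The point of the design is that name selection and file creation are no longer separate steps in the client, so the client does not need its own random number generator for this purpose.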


> FileNotFound exception on DFS block
> -----------------------------------
>
>                 Key: HADOOP-1396
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1396
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.12.3
>            Reporter: Devaraj Das
>             Fix For: 0.14.0
>
>
> Got a couple of exceptions of the form illustrated below. This was for a randomwriter run (and every node in the cluster has multiple disks).
> java.io.FileNotFoundException: /tmp/dfs/data/tmp/client-8395631522349067878 (No such file or directory)
> 	at java.io.FileInputStream.open(Native Method)
> 	at java.io.FileInputStream.<init>(FileInputStream.java:106)
> 	at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.endBlock(DFSClient.java:1323)
> 	at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.flush(DFSClient.java:1274)
> 	at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.write(DFSClient.java:1256)
> 	at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:38)
> 	at java.io.BufferedOutputStream.write(BufferedOutputStream.java:105)
> 	at java.io.DataOutputStream.write(DataOutputStream.java:90)
> 	at org.apache.hadoop.fs.ChecksumFileSystem$FSOutputSummer.write(ChecksumFileSystem.java:402)
> 	at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:38)
> 	at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
> 	at java.io.BufferedOutputStream.write(BufferedOutputStream.java:109)
> 	at java.io.DataOutputStream.write(DataOutputStream.java:90)
> 	at org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:775)
> 	at org.apache.hadoop.examples.RandomWriter$Map.map(RandomWriter.java:158)
> 	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:187)
> 	at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1709)
> So it seems like the bug reported in HADOOP-758 still exists.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

