hadoop-common-dev mailing list archives

From "Devaraj Das (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-1513) A likely race condition between the creation of a directory and checking for its existence in the DiskChecker class
Date Thu, 21 Jun 2007 19:01:30 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12506989 ]

Devaraj Das commented on HADOOP-1513:

Dhruba, I was looking at the 'if' clause together with the exception that is thrown
when its expression evaluates to true.

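The idea behind the if clause,
if (!dir.exists() && !dir.mkdirs()),
is to first check whether the directory exists and, if it does not, to create it. If the
creation fails, an exception is thrown. The catch is that mkdirs() also returns false when
another process creates the directory between the two calls, so the exception can be thrown
even though the directory exists.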

I think breaking the if clause into two parts solves the problem in the context of its usage.
If a race ever occurs, it will go this way: the first process creates the dir successfully,
and the second process fails to create it (inside the OS kernel, directory creation is
atomic). In the context of the DiskChecker.checkDir method, things still work: we throw an
exception only when we don't see the directory afterwards (we really don't need to care who
created it). So, yes, the reason for throwing the exception is different, but IMO it is
consistent overall. The exists() check itself cannot race, since the kernel provides
atomicity for the directory creation.
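
To make the two-part idea concrete, here is a minimal sketch. It is not the attached
1513.patch and not the actual DiskChecker code; the class name DirCheckSketch, the method
name ensureDir and the use of a plain IOException are assumptions made for illustration.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical sketch only; names are illustrative and not taken from Hadoop.
final class DirCheckSketch {

    // Part 1: try to create the directory and ignore the result.
    // Part 2: fail only if the directory is not there afterwards.
    static void ensureDir(File dir) throws IOException {
        // mkdirs() returns false on a real failure, but also when the directory
        // already exists (for example because another process just created it),
        // so its return value cannot distinguish the two cases.
        dir.mkdirs();

        // Only the end state matters: the directory must exist, whoever made it.
        if (!dir.exists()) {
            throw new IOException("can not create directory: " + dir);
        }
    }
}
```

With this shape, a process that loses the race falls through to the exists() check and
carries on, instead of failing because its own mkdirs() call returned false.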

BTW, the method does some more checks afterwards (readable/writable checks). Those will
catch permission issues where processes running as different users compete to create the
dir (we bail out if we discover the dir is not readable/writable).
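
The follow-up checks could look roughly like the sketch below. Again, this is an assumption
about their shape rather than a copy of the real method; DirAccessSketch and checkAccess are
made-up names, and the error messages are illustrative.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical sketch only; names and messages are illustrative.
final class DirAccessSketch {

    // Reject a path that is not a directory this process can read and write.
    static void checkAccess(File dir) throws IOException {
        if (!dir.isDirectory()) {
            throw new IOException("not a directory: " + dir);
        }
        // If another user created the directory, these checks can fail even
        // though the directory exists, and that is when we want to bail out.
        if (!dir.canRead()) {
            throw new IOException("directory is not readable: " + dir);
        }
        if (!dir.canWrite()) {
            throw new IOException("directory is not writable: " + dir);
        }
    }
}
```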

> A likely race condition between the creation of a directory and checking for its existence in the DiskChecker class
> -------------------------------------------------------------------------------------------------------------------
>                 Key: HADOOP-1513
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1513
>             Project: Hadoop
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.14.0
>            Reporter: Devaraj Das
>            Assignee: Devaraj Das
>            Priority: Critical
>             Fix For: 0.14.0
>         Attachments: 1513.patch
> Got this exception in a job run. It looks like the problem is a race condition between
> the creation of a directory and checking for its existence. Specifically, the line:
> if (!dir.exists() && !dir.mkdirs()), doesn't seem safe when invoked by multiple
> processes at the same time.
> 2007-06-21 07:55:33,583 INFO org.apache.hadoop.mapred.MapTask: numReduceTasks: 1
> 2007-06-21 07:55:33,818 WARN org.apache.hadoop.fs.AllocatorPerContext: org.apache.hadoop.util.DiskChecker$DiskErrorException: can not create directory: /export/crawlspace/kryptonite/ddas/dfs/data/tmp
> 	at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:26)
> 	at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.createPath(LocalDirAllocator.java:211)
> 	at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:248)
> 	at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.createTmpFileForWrite(LocalDirAllocator.java:276)
> 	at org.apache.hadoop.fs.LocalDirAllocator.createTmpFileForWrite(LocalDirAllocator.java:155)
> 	at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.newBackupFile(DFSClient.java:1171)
> 	at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.<init>(DFSClient.java:1136)
> 	at org.apache.hadoop.dfs.DFSClient.create(DFSClient.java:342)
> 	at org.apache.hadoop.dfs.DistributedFileSystem$RawDistributedFileSystem.create(DistributedFileSystem.java:145)
> 	at org.apache.hadoop.fs.ChecksumFileSystem$FSOutputSummer.<init>(ChecksumFileSystem.java:368)
> 	at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:443)
> 	at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:254)
> 	at org.apache.hadoop.io.SequenceFile$Writer.<init>(SequenceFile.java:675)
> 	at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:165)
> 	at org.apache.hadoop.examples.RandomWriter$Map.map(RandomWriter.java:137)
> 	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:46)
> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:189)
> 	at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1740)
> 2007-06-21 07:55:33,821 WARN org.apache.hadoop.mapred.TaskTracker: Error running child

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
