hadoop-common-dev mailing list archives

From "Devaraj Das (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-1513) A likely race condition between the creation of a directory and checking for its existence in the DiskChecker class
Date Fri, 22 Jun 2007 05:44:26 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-1513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-1513:
--------------------------------

    Status: Open  (was: Patch Available)

Ok, I realized that everything I said in my last comment holds only for a single mkdir( ) call,
but we are making a mkdirs( ) call (which internally makes a chain of mkdir( ) calls, one for
each component in the path). mkdirs( ) returns false if any of those mkdir( ) calls returns
false. So here is a case where breaking up the expression evaluated within the 'if' statement
will not solve the problem:
{noformat}
    dir.mkdirs();
    if (!dir.exists()) {
        throw new DiskErrorException("can not create directory: " 
                                    + dir.toString());
    }
{noformat}

Suppose two threads/processes (t1 & t2) go inside the mkdirs( ) call, t1 makes the first few
(successful) mkdir( ) calls, and then t2 gets to run. t2's mkdirs( ) will return false
immediately, since the first component in the path already exists. t2 then moves on to the
exists( ) check, which might return false because t1 might not yet have created the entire
directory tree. So the exception is thrown, and that is not right.
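
To make the interleaving concrete, here is a small standalone sketch (the class name and the
temp path are made up for illustration, and whether the spurious message actually appears
depends entirely on thread scheduling) that runs the same mkdirs( )-then-exists( ) pattern
from two threads:
{noformat}
import java.io.File;

// Illustration only: two threads run the checkDir-style sequence on the same
// nested path. Depending on scheduling, one thread's mkdirs() can return false
// while its exists() check still sees a partially created tree.
public class MkdirsRaceDemo {
    public static void main(String[] args) throws InterruptedException {
        final File dir = new File(System.getProperty("java.io.tmpdir"),
                                  "race-demo/a/b/c/d");
        Runnable checker = new Runnable() {
            public void run() {
                dir.mkdirs();                 // may only partially succeed
                if (!dir.exists()) {          // may observe an incomplete tree
                    System.err.println(Thread.currentThread().getName()
                            + ": can not create directory: " + dir);
                }
            }
        };
        Thread t1 = new Thread(checker, "t1");
        Thread t2 = new Thread(checker, "t2");
        t1.start(); t2.start();
        t1.join(); t2.join();
    }
}
{noformat}
The window is small, which is consistent with this only showing up occasionally in real runs.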

We have to perform the exists( ) check above for each individual component in the path whose
mkdir( ) call fails.

So we could have a custom implementation of mkdirs( ), call it mkdirsExists( ), that walks the
path components and returns false only when the expression below is false for some component:
{noformat}
   boolean mkdirsExists(String path) {
       ...
       // for each component in the path:
       if (!component.mkdir( ) && !component.exists( )) {
           return false;
       }
       ...
   }
{noformat}
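
For concreteness, a fuller sketch of how that could look (just my reading of the pseudocode
above, not the actual patch; the class name and the recursive walk via getParentFile( ) are
assumptions):
{noformat}
import java.io.File;

// Sketch only: a mkdirs() variant that tolerates mkdir() returning false for a
// component as long as that component exists afterwards (e.g. because another
// thread or process created it first).
public class DiskCheckerSketch {

    static boolean mkdirsExists(String path) {
        return mkdirsExists(new File(path));
    }

    static boolean mkdirsExists(File dir) {
        if (dir == null) {
            return false;
        }
        if (dir.exists()) {
            return dir.isDirectory();
        }
        File parent = dir.getParentFile();
        // Create the parent chain first; a false from mkdir() is acceptable
        // as long as the directory is there afterwards.
        if (parent != null && !mkdirsExists(parent)) {
            return false;
        }
        return dir.mkdir() || dir.exists();
    }
}
{noformat}
DiskChecker.checkDir( ) would then throw the DiskErrorException only when mkdirsExists( )
itself returns false, rather than relying on a single exists( ) check on the full path after
mkdirs( ).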

Makes sense?

> A likely race condition between the creation of a directory and checking for its existence in the DiskChecker class
> -------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1513
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1513
>             Project: Hadoop
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.14.0
>            Reporter: Devaraj Das
>            Assignee: Devaraj Das
>            Priority: Critical
>             Fix For: 0.14.0
>
>         Attachments: 1513.patch
>
>
> Got this exception in a job run. It looks like the problem is a race condition between the creation of a directory and checking for its existence. Specifically, the line:
> if (!dir.exists() && !dir.mkdirs()), doesn't seem safe when invoked by multiple processes at the same time.
> 2007-06-21 07:55:33,583 INFO org.apache.hadoop.mapred.MapTask: numReduceTasks: 1
> 2007-06-21 07:55:33,818 WARN org.apache.hadoop.fs.AllocatorPerContext: org.apache.hadoop.util.DiskChecker$DiskErrorException: can not create directory: /export/crawlspace/kryptonite/ddas/dfs/data/tmp
> 	at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:26)
> 	at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.createPath(LocalDirAllocator.java:211)
> 	at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:248)
> 	at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.createTmpFileForWrite(LocalDirAllocator.java:276)
> 	at org.apache.hadoop.fs.LocalDirAllocator.createTmpFileForWrite(LocalDirAllocator.java:155)
> 	at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.newBackupFile(DFSClient.java:1171)
> 	at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.(DFSClient.java:1136)
> 	at org.apache.hadoop.dfs.DFSClient.create(DFSClient.java:342)
> 	at org.apache.hadoop.dfs.DistributedFileSystem$RawDistributedFileSystem.create(DistributedFileSystem.java:145)
> 	at org.apache.hadoop.fs.ChecksumFileSystem$FSOutputSummer.(ChecksumFileSystem.java:368)
> 	at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:443)
> 	at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:254)
> 	at org.apache.hadoop.io.SequenceFile$Writer.(SequenceFile.java:675)
> 	at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:165)
> 	at org.apache.hadoop.examples.RandomWriter$Map.map(RandomWriter.java:137)
> 	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:46)
> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:189)
> 	at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1740)
> 2007-06-21 07:55:33,821 WARN org.apache.hadoop.mapred.TaskTracker: Error running child

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

