hadoop-hdfs-issues mailing list archives

From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-6262) HDFS doesn't raise FileNotFoundException if the source of a rename() is missing
Date Fri, 25 Apr 2014 10:53:15 GMT

    [ https://issues.apache.org/jira/browse/HDFS-6262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13980876#comment-13980876 ]

Steve Loughran commented on HDFS-6262:
--------------------------------------

Suresh, thanks for the link to HADOOP-6240; I hadn't seen that. But *every other filesystem*
considers renaming a file that doesn't exist to be an error.

Do we have any examples where a rename of a nonexistent file is NOT an error that the caller
needs to be told about?

Looking at the Hadoop production source:
* {{org.apache.hadoop.fs.shell.MoveCommands}} says "we have no way to know the actual error..."
and throws a {{PathIOException}}
* {{org.apache.hadoop.fs.shell.CommandWithDestination}} says "too bad we don't know why it
failed" and does the same
* {{org.apache.hadoop.io.MapFile}} raises an IOException
* {{org.apache.hadoop.tools.mapred.CopyCommitter}} raises an {{IOException}}, as does {{org.apache.hadoop.tools.mapred.RetriableFileCopyCommand}}

Similar behaviour for: 
{code}
LocalContainerLauncher, DistCpV1
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
org.apache.hadoop.mapreduce.v2.hs.HistoryServerFileSystemStateStoreService, 
...
{code}

And then there are those that blindly assume that rename's return value doesn't need checking:
{code}
JobHistoryEventHandler
TaskLog (on localFS though)
org.apache.hadoop.mapreduce.task.reduce.OnDiskMapOutput
org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster
org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl
org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore

{code}

In fact, the only bit of code I can see that converts the false return code to a warning is
{{org.apache.hadoop.tools.mapred.lib.DynamicInputChunk}}.

To summarise: in the Hadoop production code, in all but one case the handling of a false return
code takes one of two forms. The calling code either
# throws a "that failed but we don't know why" {{IOException}}, or
# is blissfully ignorant that the operation has failed, and has so far been lucky in avoiding
concurrency problems with its source being renamed while it wasn't looking.
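
For illustration only, here is a minimal sketch of those two caller patterns. It is not taken from the Hadoop source: it uses the JDK's {{java.io.File.renameTo()}}, which also returns false without saying why, as a stand-in for {{FileSystem.rename()}}. The class and method names are hypothetical.

```java
import java.io.File;
import java.io.IOException;

public class RenameReturnPatterns {

    // Pattern 1: the caller checks the boolean, but all it can report
    // is "that failed but we don't know why".
    static void renameOrThrow(File src, File dest) throws IOException {
        if (!src.renameTo(dest)) {
            throw new IOException("Failed to rename " + src + " to " + dest
                + " (cause unknown)");
        }
    }

    // Pattern 2: the return value is dropped, so a failed rename
    // goes completely unnoticed by the caller.
    static void renameAndHope(File src, File dest) {
        src.renameTo(dest);
    }

    public static void main(String[] args) {
        // A source path that does not exist (hypothetical name).
        File missing = new File(System.getProperty("java.io.tmpdir"),
            "hdfs-6262-missing-" + System.nanoTime());
        File dest = new File(missing.getPath() + ".renamed");

        renameAndHope(missing, dest);
        System.out.println("dest exists after unchecked rename: " + dest.exists());

        try {
            renameOrThrow(missing, dest);
        } catch (IOException e) {
            System.out.println("generic failure: " + e.getMessage());
        }
    }
}
```

Neither caller can tell a missing source apart from any other cause of failure.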

All of these uses benefit from having rename consistently throw a {{FileNotFoundException}} if
the source file isn't there.
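
As a rough sketch of what that buys a caller, again using {{java.io.File}} as a stand-in rather than any Hadoop API: the hypothetical helper below raises {{FileNotFoundException}} for a missing source, so callers can distinguish that case from other failures. The exists-then-rename check here is racy; a filesystem doing the same check internally would not have to be.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;

public class StrictRename {

    // Hypothetical helper: reports a missing source explicitly instead of
    // folding it into a bare "false".
    // Assumption: the check-then-rename sequence is NOT atomic in this sketch.
    static void rename(File src, File dest) throws IOException {
        if (!src.exists()) {
            throw new FileNotFoundException("rename source does not exist: " + src);
        }
        if (!src.renameTo(dest)) {
            throw new IOException("rename of " + src + " to " + dest + " failed");
        }
    }

    public static void main(String[] args) throws IOException {
        File dir = new File(System.getProperty("java.io.tmpdir"));
        File src = new File(dir, "hdfs-6262-src-" + System.nanoTime());
        File dest = new File(src.getPath() + ".renamed");

        try {
            rename(src, dest);
        } catch (FileNotFoundException e) {
            // The caller now knows exactly what went wrong.
            System.out.println("missing source: " + e.getMessage());
        }

        // With the source present, the same call succeeds.
        if (src.createNewFile()) {
            rename(src, dest);
            System.out.println("renamed: " + dest.exists());
            dest.delete();
        }
    }
}
```

Callers that only catch {{IOException}} keep working unchanged, since {{FileNotFoundException}} is a subclass of it.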




> HDFS doesn't raise FileNotFoundException if the source of a rename() is missing
> -------------------------------------------------------------------------------
>
>                 Key: HDFS-6262
>                 URL: https://issues.apache.org/jira/browse/HDFS-6262
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.4.0
>            Reporter: Steve Loughran
>            Assignee: Akira AJISAKA
>         Attachments: HDFS-6262.2.patch, HDFS-6262.patch
>
>
> HDFS's {{rename(src, dest)}} returns false if src does not exist; all the other filesystems raise {{FileNotFoundException}}.
> This behaviour is defined in {{FSDirectory.unprotectedRenameTo()}}: the attempt is logged, but the operation then just returns false.
> I propose changing the behaviour of {{DistributedFileSystem}} to be the same as that of the others, and of {{FileContext}}, which does reject renames with nonexistent sources.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
