hadoop-common-issues mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-6240) Rename operation is not consistent between different implementations of FileSystem
Date Thu, 17 Sep 2009 19:12:57 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-6240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12756707#action_12756707 ]

Doug Cutting commented on HADOOP-6240:

> I often test these apps on the local file system. The atomic flag prevents me from doing
> such tests [ ... ]

Good point.  I agree that if the local file system cannot perform atomic renames then the
atomic flag would probably be counterproductive.  When atomic is specified, we could exec
'mv', which is atomic when used within a filesystem.  We have DF#getFilesystem() that we can
use to determine if two files are on a common filesystem.  So I think we could probably
implement the ATOMIC option correctly for the local filesystem if we wanted.
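
A minimal sketch of the local-filesystem idea, not the Hadoop implementation.  It swaps in
the JDK's java.nio.file API (Files.getFileStore and StandardCopyOption.ATOMIC_MOVE, available
since Java 7) for the exec 'mv' plus DF#getFilesystem() approach described above.  The class
and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.FileStore;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper; names are illustrative and not part of Hadoop.
final class LocalAtomicRename {
    private LocalAtomicRename() {}

    /** True if src and the directory that will hold dst report the same FileStore. */
    static boolean onSameFileStore(Path src, Path dst) throws IOException {
        Path dstParent = dst.toAbsolutePath().getParent();
        if (dstParent == null) {
            return false; // dst is a root; nothing sensible to compare against
        }
        FileStore srcStore = Files.getFileStore(src);
        FileStore dstStore = Files.getFileStore(dstParent);
        return srcStore.equals(dstStore);
    }

    /**
     * Renames src to dst atomically, or fails without moving anything.
     * Note: with ATOMIC_MOVE, whether an existing dst is replaced or an
     * IOException is thrown is implementation specific.
     */
    static void renameAtomically(Path src, Path dst) throws IOException {
        if (!onSameFileStore(src, dst)) {
            throw new IOException("not on a common filesystem: " + src + " -> " + dst);
        }
        // Throws AtomicMoveNotSupportedException if the move cannot be atomic.
        Files.move(src, dst, StandardCopyOption.ATOMIC_MOVE);
    }
}
```

The same-store check plays the role of the DF#getFilesystem() comparison: it rejects the
rename up front instead of relying only on the move call to fail.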

It looks to me like S3 implements atomic copy.  You would still need to remove the source as
a second step, but one can presumably tell by dates that the copy succeeded, so the failure
cases are not as bad.  I don't know if that's good enough.

More generally, would throwing an exception for filesystems where the rename cannot be done
atomically ever be useful?  Let's say that we don't feel that S3 implements atomic rename
sufficiently well.  Would it be better, when an application wants an atomic rename, to perform
the rename non-atomically or to throw an exception?  If the application's okay with non-atomic,
then they shouldn't specify ATOMIC.  So then the question becomes, should any applications
ever specify ATOMIC?  Is it ever so important that you'd rather fail than have it non-atomic?
My guess is probably not, so perhaps we should, as you suggest, skip the ATOMIC option.
What do others think?
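
To make the two choices concrete, here is a rough sketch of "fail" versus "fall back to
non-atomic", written against java.nio.file rather than Hadoop's FileSystem.  The class name,
method name, and the requireAtomic flag are hypothetical stand-ins for an ATOMIC option.

```java
import java.io.IOException;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper; names are illustrative and not part of Hadoop.
final class RenamePolicySketch {
    private RenamePolicySketch() {}

    /**
     * Tries an atomic move first.
     * If that is unsupported and requireAtomic is true, the exception propagates
     * and nothing is moved; otherwise a plain, possibly non-atomic move is done.
     * Returns true only when the atomic move itself succeeded.
     */
    static boolean rename(Path src, Path dst, boolean requireAtomic) throws IOException {
        try {
            Files.move(src, dst, StandardCopyOption.ATOMIC_MOVE);
            return true;
        } catch (AtomicMoveNotSupportedException e) {
            if (requireAtomic) {
                throw e; // the "rather fail than have it non-atomic" choice
            }
            Files.move(src, dst); // best effort: may copy and then delete
            return false;
        }
    }
}
```

A caller that passes requireAtomic=false gets the same result as one that never asked for
atomicity, which is the argument above for dropping the option.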

Even if we only have a single option initially, OVERWRITE, we should still probably make the
method accept multiple options, to future-proof it.  Also note that, if the signature is new,
we may not need a different name!
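
One way such a signature could look, as a sketch only: a varargs enum parameter lets callers
pass zero or more options, and new options can be added later without changing the method.
The RenameOption enum and the in-memory map standing in for a filesystem are hypothetical
and are not the Hadoop API.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.util.Arrays;
import java.util.EnumSet;
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for a filesystem; not part of Hadoop.
final class RenameOptionsSketch {
    /** A single option for now; more values can be added later. */
    enum RenameOption { OVERWRITE }

    private final Map<String, String> files = new HashMap<>();

    void create(String path, String contents) { files.put(path, contents); }

    /** Returns the contents at path, or null if no such file exists. */
    String read(String path) { return files.get(path); }

    /** Renames src to dst; an existing dst is replaced only if OVERWRITE is given. */
    void rename(String src, String dst, RenameOption... options) throws IOException {
        EnumSet<RenameOption> opts = EnumSet.noneOf(RenameOption.class);
        opts.addAll(Arrays.asList(options));

        if (!files.containsKey(src)) {
            throw new FileNotFoundException(src);
        }
        if (files.containsKey(dst) && !opts.contains(RenameOption.OVERWRITE)) {
            throw new FileAlreadyExistsException(dst);
        }
        files.put(dst, files.remove(src));
    }
}
```

Calling rename("a", "b") fails when "b" exists, while rename("a", "b", RenameOption.OVERWRITE)
replaces it.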

> Rename operation is not consistent between different implementations of FileSystem
> ----------------------------------------------------------------------------------
>                 Key: HADOOP-6240
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6240
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>            Reporter: Suresh Srinivas
>            Assignee: Suresh Srinivas
>             Fix For: 0.21.0
> The rename operation has many scenarios that are not consistently implemented across
> file systems.

