hadoop-common-issues mailing list archives

From "Chris Nauroth (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-12550) NativeIO#renameTo on Windows cannot replace an existing file at the destination.
Date Thu, 21 Apr 2016 16:33:25 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-12550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15252167#comment-15252167 ]

Chris Nauroth commented on HADOOP-12550:
----------------------------------------

Hello [~GergelyNovak].

Thank you for the suggestion, but unfortunately, that wouldn't quite provide the expected
semantics.  Callers of rename typically have an expectation of atomicity, such that the rename
either succeeds completely or fails completely, with no visible in-between states.  With the
proposed change, if the process crashes or the host powers down after the delete, but before
the rename, then the destination file is permanently lost.  In the specific case described
in the example, a DataNode could lose a block.

There are already a few spots in the codebase where we do use a similar delete-then-rename
workaround for Windows, but it's not ideal.  I'd prefer to keep the scope of this issue to
providing an atomic rename-with-replace operation.
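
To make the difference concrete, here is a minimal Java sketch contrasting the two approaches.
It is an illustration only, not the patch attached to this issue (which works in the native
code behind {{NativeIO#renameTo}}).  The class, method, and file names are hypothetical, and
{{java.nio.file.Files#move}} is used as a stand-in.  Note that the Javadoc for {{ATOMIC_MOVE}}
leaves it implementation specific whether an existing target is replaced or the call fails,
so the atomic variant should be read as a sketch of the desired semantics.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class RenameWithReplace {

  // Workaround style: two separate file system operations.
  // If the process crashes or the host powers down between them,
  // nothing exists at dst any more and the old contents are gone.
  static void deleteThenRename(Path src, Path dst) throws IOException {
    Files.deleteIfExists(dst);
    // <-- crash window: dst is missing, src has not been moved yet
    Files.move(src, dst);
  }

  // Desired style: a single atomic move request.
  // Assumption: the platform replaces an existing target here; the
  // Javadoc says this is implementation specific under ATOMIC_MOVE.
  static void atomicReplace(Path src, Path dst) throws IOException {
    Files.move(src, dst, StandardCopyOption.ATOMIC_MOVE);
  }

  public static void main(String[] args) throws IOException {
    // Hypothetical file names, loosely modelled on a block replacement.
    Path dir = Files.createTempDirectory("rename-demo");
    Path src = dir.resolve("blk_new");
    Path dst = dir.resolve("blk_current");
    Files.write(src, "new".getBytes(StandardCharsets.UTF_8));
    Files.write(dst, "old".getBytes(StandardCharsets.UTF_8));

    atomicReplace(src, dst);
    System.out.println(new String(Files.readAllBytes(dst), StandardCharsets.UTF_8));
  }
}
```

With {{deleteThenRename}}, an observer (or a restart after a crash) can see a state where the
destination does not exist at all.  With the atomic variant, the destination holds either the
old contents or the new contents, which is the expectation callers of rename typically have.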

> NativeIO#renameTo on Windows cannot replace an existing file at the destination.
> --------------------------------------------------------------------------------
>
>                 Key: HADOOP-12550
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12550
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: native
>         Environment: Windows
>            Reporter: Chris Nauroth
>            Assignee: Chris Nauroth
>         Attachments: HADOOP-12550.001.patch, HADOOP-12550.002.patch
>
>
> {{NativeIO#renameTo}} currently has different semantics on Linux vs. Windows if a file
> already exists at the destination.  On Linux, it's a passthrough to the [rename|http://linux.die.net/man/2/rename]
> syscall, which will replace an existing file at the destination.  On Windows, it's a passthrough
> to [MoveFile|https://msdn.microsoft.com/en-us/library/windows/desktop/aa365239%28v=vs.85%29.aspx?f=255&MSPPError=-2147217396],
> which cannot replace an existing file at the destination and instead triggers an error.  The
> easiest way to observe this difference is to run the HDFS test {{TestRollingUpgrade#testRollback}}.
> This fails on Windows due to a block recovery after truncate trying to replace a block at
> an existing destination path.  This issue proposes to use [MoveFileEx|https://msdn.microsoft.com/en-us/library/windows/desktop/aa365240(v=vs.85).aspx]
> on Windows with the {{MOVEFILE_REPLACE_EXISTING}} flag.
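
The two behaviours described in the issue (fail if the destination exists vs. replace it) can
be reproduced on any platform with a small Java analogy.  This is only a sketch:
{{java.nio.file.Files#move}} is not {{NativeIO#renameTo}}, and the class and method names below
are hypothetical.  By default {{Files#move}} fails when the target exists, loosely mirroring
{{MoveFile}}; passing {{REPLACE_EXISTING}} opts in to replacement, loosely mirroring
{{MoveFileEx}} with {{MOVEFILE_REPLACE_EXISTING}}.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class MoveSemantics {

  // Default move: refuses to overwrite an existing target.
  // Returns true if the move was rejected because dst already exists.
  static boolean moveFailsWhenTargetExists(Path src, Path dst) throws IOException {
    try {
      Files.move(src, dst);
      return false;
    } catch (FileAlreadyExistsException e) {
      return true;
    }
  }

  // Move with explicit opt-in to replacing an existing target.
  static void moveReplacing(Path src, Path dst) throws IOException {
    Files.move(src, dst, StandardCopyOption.REPLACE_EXISTING);
  }

  public static void main(String[] args) throws IOException {
    Path dir = Files.createTempDirectory("move-semantics");
    Path src = Files.createFile(dir.resolve("src"));
    Path dst = Files.createFile(dir.resolve("dst"));

    System.out.println("default move rejected: " + moveFailsWhenTargetExists(src, dst));
    moveReplacing(src, dst);
    System.out.println("src still exists: " + Files.exists(src));
  }
}
```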



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
