hadoop-mapreduce-issues mailing list archives

From "Aaron Kimball (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-972) distcp can timeout during rename operation to s3
Date Mon, 05 Oct 2009 20:15:31 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12762342#action_12762342 ]

Aaron Kimball commented on MAPREDUCE-972:

Hmm. In HADOOP-5814, the FSDataOutputStream is given a Progressable in its constructor, which
is then guaranteed to be valid up through the call to close(). Using that same Progressable in
close(), and anywhere else in the lifetime of the OutputStream, makes sense.
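
A minimal sketch of that stream-scoped pattern, for illustration only. The local Progressable
interface is a stand-in for org.apache.hadoop.util.Progressable, and ProgressReportingOutputStream
and ProgressStreamDemo are hypothetical names, not the HADOOP-5814 code:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.concurrent.atomic.AtomicInteger;

/** Local stand-in for org.apache.hadoop.util.Progressable (assumption for this sketch). */
interface Progressable {
    void progress();
}

/** Hypothetical wrapper: keeps the Progressable given at construction and reuses it until close(). */
class ProgressReportingOutputStream extends OutputStream {
    private final OutputStream out;
    private final Progressable progress;

    ProgressReportingOutputStream(OutputStream out, Progressable progress) {
        this.out = out;
        this.progress = progress;
    }

    @Override
    public void write(int b) throws IOException {
        out.write(b);
        progress.progress();
    }

    @Override
    public void close() throws IOException {
        // The Progressable is still valid here, so a slow close can report liveness too.
        progress.progress();
        out.close();
    }
}

class ProgressStreamDemo {
    /** Writes two bytes, closes the stream, and returns how often progress was reported. */
    static int run() {
        AtomicInteger ticks = new AtomicInteger();
        try (OutputStream s = new ProgressReportingOutputStream(
                new ByteArrayOutputStream(), ticks::incrementAndGet)) {
            s.write(1);
            s.write(2);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return ticks.get(); // two writes plus one close
    }
}
```

The point is only that the stream owns its Progressable for a well-defined lifetime, from
construction to close().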

But in NativeS3FileSystem itself, Progressable objects are provided directly only to the create()
and append() methods. It's not guaranteed that either of these will be called before the call
to rename(). Moreover, because FileSystem instances are cached, with JVM reuse a Progressable
memoized in one of those methods might belong to an earlier task rather than the current one.
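
To make the hazard concrete, here is a toy model, not Hadoop code. MemoizingFileSystem, its
scheme-keyed cache, and StaleProgressDemo are hypothetical, and the local Progressable interface
stands in for org.apache.hadoop.util.Progressable:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

/** Local stand-in for org.apache.hadoop.util.Progressable (assumption for this sketch). */
interface Progressable {
    void progress();
}

/** Toy file system that (unsafely) remembers the last Progressable it was handed. */
class MemoizingFileSystem {
    // Simplified cache keyed by scheme only; the same instance is shared across tasks in one JVM.
    private static final Map<String, MemoizingFileSystem> CACHE = new HashMap<>();

    private Progressable lastSeen; // null until create() has been called

    static synchronized MemoizingFileSystem get(String scheme) {
        return CACHE.computeIfAbsent(scheme, s -> new MemoizingFileSystem());
    }

    void create(String path, Progressable progress) {
        lastSeen = progress;
        progress.progress();
    }

    void rename(String src, String dst) {
        // No Progressable parameter, so the only candidate is the memoized one,
        // which may be missing or may belong to an earlier task.
        if (lastSeen != null) {
            lastSeen.progress();
        }
    }
}

class StaleProgressDemo {
    /** Returns {progress calls seen by task 1, progress calls seen by task 2}. */
    static int[] run() {
        AtomicInteger task1 = new AtomicInteger();
        AtomicInteger task2 = new AtomicInteger();
        // Task 1 creates a file and hands over its Progressable.
        MemoizingFileSystem.get("s3n").create("/a", task1::incrementAndGet);
        // Task 2 runs later in the reused JVM and only renames; it never calls create().
        MemoizingFileSystem.get("s3n").rename("/a", "/b");
        return new int[] { task1.get(), task2.get() };
    }
}
```

In this model the rename issued by the second task reports progress to the first task's
Progressable, and the second task's own counter never moves.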

I don't see a way to get a valid Progressable into rename(), delete(), etc., without modifying
the API of FileSystem itself, which would be a pretty big change.
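
For the sake of discussion, this is roughly the shape such an API change could take. It is a
sketch under assumptions: the three-argument rename() below does not exist in FileSystem,
ToyObjectStore is a hypothetical in-memory stand-in for S3, and the chunked loop merely stands in
for whatever slow work the copy does:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

/** Local stand-in for org.apache.hadoop.util.Progressable (assumption for this sketch). */
interface Progressable {
    void progress();
}

/** Hypothetical in-memory object store whose rename is copy + delete. */
class ToyObjectStore {
    private static final int CHUNK = 4; // bytes copied between progress reports
    private final Map<String, byte[]> objects = new HashMap<>();

    void put(String key, byte[] data) {
        objects.put(key, data.clone());
    }

    boolean exists(String key) {
        return objects.containsKey(key);
    }

    /** Hypothetical overload: the caller supplies the Progressable for this one operation. */
    boolean rename(String src, String dst, Progressable progress) {
        byte[] data = objects.get(src);
        if (data == null) {
            return false;
        }
        byte[] copy = new byte[data.length];
        for (int off = 0; off < data.length; off += CHUNK) {
            int len = Math.min(CHUNK, data.length - off);
            System.arraycopy(data, off, copy, off, len);
            progress.progress(); // report liveness while the slow copy is under way
        }
        objects.put(dst, copy);
        objects.remove(src); // the delete half of copy + delete
        return true;
    }
}

class RenameDemo {
    /** Renames a 10-byte object and returns the number of progress reports. */
    static int run() {
        AtomicInteger ticks = new AtomicInteger();
        ToyObjectStore store = new ToyObjectStore();
        store.put("src", new byte[10]);
        store.rename("src", "dst", ticks::incrementAndGet);
        return ticks.get(); // chunks of 4, 4, and 2 bytes
    }
}
```

Because the Progressable arrives with the call, it always belongs to the task doing the rename,
but every FileSystem implementation and caller would have to learn the new signature.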

> distcp can timeout during rename operation to s3
> ------------------------------------------------
>                 Key: MAPREDUCE-972
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-972
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: distcp
>    Affects Versions: 0.20.1
>            Reporter: Aaron Kimball
>            Assignee: Aaron Kimball
>         Attachments: MAPREDUCE-972.2.patch, MAPREDUCE-972.3.patch, MAPREDUCE-972.patch
>
> rename() in S3 is implemented as copy + delete. The S3 copy operation can perform very
> slowly, which may cause task timeout.

