hadoop-mapreduce-issues mailing list archives

From "Aaron Kimball (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-972) distcp can timeout during rename operation to s3
Date Sun, 18 Oct 2009 20:45:31 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767129#action_12767129 ]

Aaron Kimball commented on MAPREDUCE-972:

As discussed earlier, the FileSystem API does not provide a means for operations such as rename()
to get access to a Progressable. I do not see a straightforward way to improve the S3FS /
S3N implementations without extending the FileSystem API to add operations such as {{rename(src,
dst, progress)}}.  Are you +1 on doing that?
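
To make the proposal concrete, here is a minimal sketch of what a progress-aware rename overload could look like. It does not use the real Hadoop classes: {{Progressable}} below is a local stand-in with the same single {{progress()}} method as Hadoop's interface, and {{CopyDeleteStore}}, its methods, and {{CHUNK_BYTES}} are hypothetical names invented for illustration. The store imitates a copy + delete rename and reports progress once per copied chunk.

```java
import java.util.HashMap;
import java.util.Map;

// Local stand-in for Hadoop's org.apache.hadoop.util.Progressable.
interface Progressable {
  void progress();
}

/** Hypothetical in-memory store whose rename is copy + delete. Not a real FileSystem. */
class CopyDeleteStore {
  private static final int CHUNK_BYTES = 4; // tiny on purpose, so the demo has several chunks
  private final Map<String, byte[]> objects = new HashMap<>();

  void put(String key, byte[] data) {
    objects.put(key, data.clone());
  }

  boolean exists(String key) {
    return objects.containsKey(key);
  }

  byte[] get(String key) {
    byte[] data = objects.get(key);
    return data == null ? null : data.clone();
  }

  /** Existing-style rename: the caller has no way to report liveness during the copy. */
  boolean rename(String src, String dst) {
    return rename(src, dst, () -> { });
  }

  /** Proposed-style overload: progress() is called after every copied chunk. */
  boolean rename(String src, String dst, Progressable progress) {
    byte[] source = objects.get(src);
    if (source == null) {
      return false; // nothing to rename
    }
    if (src.equals(dst)) {
      return true; // renaming onto itself is a no-op
    }
    byte[] copy = new byte[source.length];
    for (int off = 0; off < source.length; off += CHUNK_BYTES) {
      int len = Math.min(CHUNK_BYTES, source.length - off);
      System.arraycopy(source, off, copy, off, len);
      progress.progress(); // tell the framework the task is still alive
    }
    objects.put(dst, copy);  // copy ...
    objects.remove(src);     // ... then delete
    return true;
  }
}
```

With an overload of this shape, the caller would pass its task reporter straight into rename() and no helper thread would be needed.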

Either way, I agree with your criticisms of the progress thread implementation. I have the
following plan for improving this:
* Make the progress thread's lifetime equal to that of the mapper. The first rename() operation
starts it, and the join() moves to close()
* Progress thread is only active when a rename() operation is underway. Use a volatile boolean
to track this state. Otherwise it just sleeps.
* Use {{Thread.interrupt()}} / {{isInterrupted()}} to interrupt the sleep in the main loop,
so that we don't have to wait the full three seconds before the thread exits.
* Add {{distcp.rename.timeout}} as a parameter that sets the maximum lifetime of the progress
thread's inner loop. The default value will be 10 seconds, but if the destination filesystem
is detected to be s3n:// or s3fs://, the default is raised to fifteen minutes.
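
The plan above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from any MAPREDUCE-972 patch: {{Progressable}} is a local stand-in for Hadoop's interface, and {{RenameProgressThread}}, {{renameStarted()}}, {{renameFinished()}}, {{shutdown()}}, {{defaultTimeoutMillis()}}, and the polling-interval parameter are hypothetical names. The timeout argument plays the role of {{distcp.rename.timeout}}, and the scheme names are taken verbatim from the list above.

```java
// Local stand-in for Hadoop's org.apache.hadoop.util.Progressable.
interface Progressable {
  void progress();
}

/** Hypothetical helper that reports progress while a rename() is underway. */
class RenameProgressThread extends Thread {
  private final Progressable reporter;
  private final long timeoutMillis;  // role of distcp.rename.timeout
  private final long intervalMillis; // sleep between checks
  private volatile boolean renameActive = false;
  private volatile long renameStartNanos;
  private boolean started = false;   // touched only by the mapper thread

  RenameProgressThread(Progressable reporter, long timeoutMillis, long intervalMillis) {
    this.reporter = reporter;
    this.timeoutMillis = timeoutMillis;
    this.intervalMillis = intervalMillis;
    setDaemon(true);
  }

  /** Default timeout: 10 seconds, or fifteen minutes for the S3 schemes named above. */
  static long defaultTimeoutMillis(String scheme) {
    boolean isS3 = "s3n".equals(scheme) || "s3fs".equals(scheme);
    return isS3 ? 15L * 60L * 1000L : 10L * 1000L;
  }

  /** Call just before rename(). The first call also starts the thread. */
  void renameStarted() {
    renameStartNanos = System.nanoTime();
    renameActive = true;
    if (!started) {
      started = true;
      start();
    }
  }

  /** Call after rename() returns; the thread goes back to just sleeping. */
  void renameFinished() {
    renameActive = false;
  }

  @Override
  public void run() {
    while (!isInterrupted()) {
      if (renameActive) {
        long elapsedMillis = (System.nanoTime() - renameStartNanos) / 1_000_000L;
        if (elapsedMillis < timeoutMillis) {
          reporter.progress(); // stop vouching for the rename once the timeout has passed
        }
      }
      try {
        Thread.sleep(intervalMillis);
      } catch (InterruptedException e) {
        return; // interrupted by shutdown(): exit without finishing the sleep
      }
    }
  }

  /** Call from the mapper's close(): wake the thread and wait for it to exit. */
  void shutdown() {
    interrupt();
    try {
      join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the caller's interrupt status
    }
  }
}
```

A mapper would wrap each rename() call in {{renameStarted()}} / {{renameFinished()}} and call {{shutdown()}} once from close(), so a single thread lives for the whole task.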

- Aaron

> distcp can timeout during rename operation to s3
> ------------------------------------------------
>                 Key: MAPREDUCE-972
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-972
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: distcp
>    Affects Versions: 0.20.1
>            Reporter: Aaron Kimball
>            Assignee: Aaron Kimball
>         Attachments: MAPREDUCE-972.2.patch, MAPREDUCE-972.3.patch, MAPREDUCE-972.4.patch,
> MAPREDUCE-972.5.patch, MAPREDUCE-972.patch
>
> rename() in S3 is implemented as copy + delete. The S3 copy operation can perform very
> slowly, which may cause task timeout.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
