hadoop-mapreduce-issues mailing list archives

From "Aaron Kimball (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-972) distcp can timeout during rename operation to s3
Date Mon, 19 Oct 2009 18:29:59 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767440#action_12767440 ]

Aaron Kimball commented on MAPREDUCE-972:
-----------------------------------------

Hm. Looking at FilterFileSystem, I think that's the most general and non-invasive solution.
I can add a FilterFS that includes the methods {{setProgressable()}} and {{setRenameTimeout()}}.
The rename operation will then manage the background progress thread in the state-machine
style described above. Since this isn't distcp-specific, I'll probably rename the config variable
to something like {{fs.progressable.rename.timeout}}. (Other suggestions welcome.)

Distcp will then wrap its {{dstfs}} in a {{ProgressFs}} instance.
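
A rough sketch of what such a wrapper could look like follows. It is written against small stand-in interfaces rather than the real Hadoop classes, so {{Progressable}}, {{RenameFs}}, the constructor and the reporting interval here are all hypothetical. It also does not reproduce the state-machine design mentioned above; it only assumes that the timeout bounds how long the background thread keeps reporting progress, so that a rename that is truly hung can still be killed by the normal task timeout.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a progress callback (not the Hadoop interface). */
interface Progressable {
    void progress();
}

/** Hypothetical stand-in exposing only the one file system call the sketch needs. */
interface RenameFs {
    boolean rename(String src, String dst) throws IOException;
}

/** Filter-style wrapper: delegates rename() and reports progress while it runs. */
class ProgressFs implements RenameFs {
    private final RenameFs inner;
    private final long intervalMs;               // assumed pause between progress reports
    private volatile Progressable progressable;  // null means no reporting
    private volatile long renameTimeoutMs = 0;   // 0 means report for as long as rename runs

    ProgressFs(RenameFs inner, long intervalMs) {
        this.inner = inner;
        this.intervalMs = intervalMs;
    }

    void setProgressable(Progressable p) { this.progressable = p; }

    /** Value would come from a setting such as the proposed fs.progressable.rename.timeout. */
    void setRenameTimeout(long timeoutMs) { this.renameTimeoutMs = timeoutMs; }

    @Override
    public boolean rename(String src, String dst) throws IOException {
        final Progressable p = progressable;
        if (p == null) {
            return inner.rename(src, dst);
        }
        final long timeoutMs = renameTimeoutMs;
        final long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        Thread reporter = new Thread(() -> {
            try {
                // Keep reporting until interrupted or until the timeout elapses.
                while (timeoutMs <= 0 || System.nanoTime() < deadline) {
                    p.progress();
                    Thread.sleep(intervalMs);
                }
            } catch (InterruptedException e) {
                // rename() returned; stop reporting.
            }
        }, "rename-progress");
        reporter.setDaemon(true);
        reporter.start();
        try {
            return inner.rename(src, dst);
        } finally {
            reporter.interrupt();
            try {
                reporter.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }
}

class ProgressFsDemo {
    public static void main(String[] args) throws IOException {
        // Simulated slow rename, standing in for the S3 copy + delete.
        RenameFs slow = (src, dst) -> {
            try {
                Thread.sleep(300);
            } catch (InterruptedException e) {
                throw new IOException(e);
            }
            return true;
        };
        AtomicInteger ticks = new AtomicInteger();
        ProgressFs dstfs = new ProgressFs(slow, 20);
        dstfs.setProgressable(ticks::incrementAndGet);
        dstfs.setRenameTimeout(10_000);
        boolean ok = dstfs.rename("tmp/part-00000", "out/part-00000");
        System.out.println("renamed=" + ok + ", progress calls=" + ticks.get());
    }
}
```

In the real patch the wrapper would presumably extend FilterFileSystem and take the task's reporter as the progress callback; the stand-ins above only keep the sketch runnable without Hadoop on the classpath.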


> distcp can timeout during rename operation to s3
> ------------------------------------------------
>
>                 Key: MAPREDUCE-972
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-972
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: distcp
>    Affects Versions: 0.20.1
>            Reporter: Aaron Kimball
>            Assignee: Aaron Kimball
>         Attachments: MAPREDUCE-972.2.patch, MAPREDUCE-972.3.patch, MAPREDUCE-972.4.patch, MAPREDUCE-972.5.patch, MAPREDUCE-972.patch
>
>
> rename() in S3 is implemented as copy + delete. The S3 copy operation can be very slow, which may cause the task to time out.


