hadoop-common-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Rajesh Balamohan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-13145) In DistCp, prevent unnecessary getFileStatus call when not preserving metadata.
Date Mon, 16 May 2016 07:01:12 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-13145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15284211#comment-15284211
] 

Rajesh Balamohan commented on HADOOP-13145:
-------------------------------------------

Thanks for sharing the patch [~cnauroth]. I tried out the patch with hadoop-2.8 (may 16) and
I do not see any more task failures with dist-cp to S3.  I see some straggler tasks taking
time, but that is completely unrelated to this jira.

> In DistCp, prevent unnecessary getFileStatus call when not preserving metadata.
> -------------------------------------------------------------------------------
>
>                 Key: HADOOP-13145
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13145
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: tools/distcp
>            Reporter: Chris Nauroth
>            Assignee: Chris Nauroth
>         Attachments: HADOOP-13145.001.patch
>
>
> After DistCp copies a file, it calls {{getFileStatus}} to get the {{FileStatus}} from
the destination so that it can compare to the source and update metadata if necessary.  If
the DistCp command was run without the option to preserve metadata attributes, then this additional
{{getFileStatus}} call is wasteful.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


Mime
View raw message