hadoop-mapreduce-issues mailing list archives

From "Ravi Gummadi (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-654) Add an option -count to distcp for displaying some info about the src files
Date Thu, 17 Sep 2009 11:20:57 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12756481#action_12756481 ]

Ravi Gummadi commented on MAPREDUCE-654:

Yes. Venkatesh is working on a new -dryrun option that also displays the files to be
copied by distcp. This option (-count) will be renamed to -dryrun.

> Add an option -count to distcp for displaying some info about the src files
> ---------------------------------------------------------------------------
>                 Key: MAPREDUCE-654
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-654
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: distcp
>    Affects Versions: 0.21.0
>            Reporter: Ravi Gummadi
>            Assignee: Ravi Gummadi
>             Fix For: 0.21.0
>         Attachments: d_count.patch, d_count654.patch, d_count_v1.patch
> Add an option -count to distcp for displaying metadata about the src files, such as the
> number of files to be copied and their total size.
> With -count, distcp doesn't copy anything. It just displays the info and exits.
> This is especially useful when combined with -update.
>  distcp -update -count <src>* <dst>
>       would display the number of files to be updated and the total size of the data to be
> copied (determined by comparing the file sizes and checksums at src and dst). Based on this
> info, users could allocate the number of nodes needed for the actual update job.
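
The counting described above can be illustrated with a minimal Python sketch. It is not the attached patch and does not use any Hadoop API: the names FileInfo and count_files_to_copy are hypothetical, and file listings are modeled as plain dictionaries from relative path to (size, checksum). It assumes that, with update enabled, a file is skipped only when the destination already has the same path with matching size and checksum.

```python
from typing import Dict, NamedTuple, Tuple


class FileInfo(NamedTuple):
    """Hypothetical per-file metadata used only for this illustration."""
    size: int       # file length in bytes
    checksum: str   # any content checksum, e.g. a hex digest


def count_files_to_copy(src: Dict[str, FileInfo],
                        dst: Dict[str, FileInfo],
                        update: bool = False) -> Tuple[int, int]:
    """Return (number of files, total bytes) that a copy would transfer.

    Nothing is copied. With update=True, a source file is skipped when the
    destination already holds a file at the same relative path with the
    same size and checksum (assumed comparison rule).
    """
    num_files = 0
    total_bytes = 0
    for path, info in src.items():
        if update:
            existing = dst.get(path)
            if existing is not None and existing == info:
                continue  # already up to date at the destination
        num_files += 1
        total_bytes += info.size
    return num_files, total_bytes


if __name__ == "__main__":
    src = {
        "a.txt": FileInfo(100, "aa"),
        "b.txt": FileInfo(250, "bb"),
        "c.txt": FileInfo(40, "cc"),
    }
    dst = {
        "a.txt": FileInfo(100, "aa"),  # unchanged, skipped with update
        "b.txt": FileInfo(250, "xx"),  # same size, different checksum
    }
    print(count_files_to_copy(src, dst, update=True))  # (2, 290)
    print(count_files_to_copy(src, dst))               # (3, 390)
```

The returned pair corresponds to the two figures the option is meant to report, which a user could then use to size the actual update job.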

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
