hadoop-common-issues mailing list archives

From "Jing Zhao (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-9700) Snapshot support for distcp
Date Fri, 12 Jul 2013 07:03:50 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-9700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13706716#comment-13706716 ]

Jing Zhao commented on HADOOP-9700:

bq. MAPREDUCE-2257 aims to change the copy unit to block.

Thanks for leading me to this jira, Luke! I think the incremental copy case may be more complicated.
What we need includes:
1. Changing the smallest copy unit from file to block.
2. A more powerful "concat" functionality which can concatenate two (or more) files even when
the last block of the first file is not full.
3. Without 2, we can instead re-transfer the last block of the old file in the backup cluster.
But in that case we may need the capability to replace a block of a file.
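
To make option 3 concrete, here is a minimal sketch of which blocks an incremental
copy would have to send. It is only an illustration, not code from any patch on this
jira. It assumes fixed-size blocks and a file that only grew by appends; the function
name blocks_to_copy and its parameters are hypothetical.

```python
def blocks_to_copy(old_len, new_len, block_size):
    """Return the block indices to (re)transfer for a file that grew by appends.

    Assumptions (not from the jira): every block has the same fixed size,
    and the bytes already on the backup cluster are unchanged.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    if new_len < old_len:
        raise ValueError("file shrank; this is not an append-only change")
    if new_len == old_len:
        return []  # nothing was appended, so nothing needs to be sent

    # If the old file ended in a partial block, integer division lands on that
    # block, so it is re-transferred. If the old file ended exactly on a block
    # boundary, this is simply the first new block.
    first = old_len // block_size
    # Number of blocks in the new file (ceiling division).
    total = -(-new_len // block_size)
    return list(range(first, total))
```

With a block size of 128, growing a file from 300 to 600 bytes re-sends block 2 (the
old partial last block) plus the new blocks 3 and 4. That re-sent block is exactly the
one the backup cluster would need to replace.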

bq. If you mean call createSnapshot inside distcp then delete the snapshot after job finished,
I think it is reasonable.
Yes. And deleting the snapshot could be optional, since we may use the snapshot for a future
incremental copy.
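
As a rough sketch of that workflow (not what the attached patch does), the steps can
be written as a list of shell commands: create a snapshot, copy from the read-only
snapshot path, and delete the snapshot only if asked to. The helper
plan_snapshot_copy and its parameters are hypothetical, and the example cluster
addresses are made up. The source directory is assumed to be snapshottable already.

```python
def plan_snapshot_copy(src_dir, dst, snapshot_name, keep_snapshot=True):
    """Build the commands for a snapshot-based copy without running them.

    Assumption: src_dir has already been made snapshottable by an admin
    (hdfs dfsadmin -allowSnapshot <dir>).
    """
    stripped = src_dir.rstrip("/")
    snap_dir = stripped or "/"
    # A snapshot is exposed under the reserved ".snapshot" directory.
    snapshot_path = f"{stripped}/.snapshot/{snapshot_name}"

    plan = [
        ["hdfs", "dfs", "-createSnapshot", snap_dir, snapshot_name],
        # Copy from the snapshot so the source cannot change mid-copy.
        ["hadoop", "distcp", snapshot_path, dst],
    ]
    if not keep_snapshot:
        plan.append(["hdfs", "dfs", "-deleteSnapshot", snap_dir, snapshot_name])
    return plan


if __name__ == "__main__":
    for cmd in plan_snapshot_copy("hdfs://nn1/data", "hdfs://nn2/backup/data", "s1"):
        print(" ".join(cmd))
```

Keeping the snapshot is the default here because a later incremental copy would need
it as the baseline to compare against.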

> Snapshot support for distcp
> ---------------------------
>                 Key: HADOOP-9700
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9700
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: tools/distcp
>            Reporter: Binglin Chang
>            Assignee: Binglin Chang
>         Attachments: HADOOP-9700-demo.patch
> Add snapshot incremental copy ability to distcp, so we can do iterative consistent backup
> between hadoop clusters.

