hadoop-hdfs-issues mailing list archives

From "Jing Zhao (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDFS-7535) Utilize Snapshot diff report for distcp
Date Tue, 16 Dec 2014 19:30:15 GMT
Jing Zhao created HDFS-7535:

             Summary: Utilize Snapshot diff report for distcp
                 Key: HDFS-7535
                 URL: https://issues.apache.org/jira/browse/HDFS-7535
             Project: Hadoop HDFS
          Issue Type: Improvement
            Reporter: Jing Zhao
            Assignee: Jing Zhao

Currently, the HDFS snapshot diff report can identify file/directory creation, deletion, rename,
and modification under a snapshottable directory. We can use the diff report when running distcp
between the primary cluster and a backup cluster to avoid unnecessary data copying. This is
especially useful when a large directory is renamed in the primary cluster: the current distcp
cannot detect the rename operation, so the rename usually leads to a large amount of actual
data being copied.
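
To illustrate the idea only (this is not the approach proposed in this issue, whose details
are deferred to the first comment), here is a minimal sketch in Python. It assumes the diff
report has already been parsed into entries tagged with the markers printed by
`hdfs snapshotDiff` ("+" created, "-" deleted, "M" modified, "R" renamed). The names
`DiffEntry` and `plan_sync` are hypothetical, every entry is treated as a plain file, and
ordering problems between renames and deletes are ignored.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class DiffEntry:
    """One parsed line of a snapshot diff report (hypothetical model)."""
    kind: str                     # "+", "-", "M", or "R"
    path: str                     # path in the earlier snapshot (or the new path for "+")
    target: Optional[str] = None  # new path, only set for renames


def plan_sync(diff: List[DiffEntry]) -> Tuple[List[tuple], List[str]]:
    """Split a diff into cheap metadata operations and paths that need a real copy."""
    metadata_ops = []  # operations to replay directly on the backup cluster
    copy_list = []     # paths whose data must actually be transferred
    for entry in diff:
        if entry.kind == "R":
            # A rename moves no data: replay it on the backup instead of recopying.
            metadata_ops.append(("rename", entry.path, entry.target))
        elif entry.kind == "-":
            metadata_ops.append(("delete", entry.path, None))
        elif entry.kind in ("+", "M"):
            copy_list.append(entry.path)
        else:
            raise ValueError(f"unknown diff type: {entry.kind!r}")
    return metadata_ops, copy_list


if __name__ == "__main__":
    diff = [
        DiffEntry("R", "/data/big", "/data/archive"),
        DiffEntry("+", "/data/new.txt"),
        DiffEntry("M", "/data/log.txt"),
        DiffEntry("-", "/data/old.txt"),
    ]
    ops, copies = plan_sync(diff)
    print("metadata ops:", ops)
    print("to copy:", copies)
```

In this sketch the renamed directory contributes a single metadata operation rather than
a copy of everything beneath it, which is the saving described above.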

More details of the approach will come in the first comment.

This message was sent by Atlassian JIRA
