hadoop-common-commits mailing list archives

From ka...@apache.org
Subject [34/50] [abbrv] hadoop git commit: HDFS-11040. Add documentation for HDFS-9820 distcp improvement. Contributed by Yongjun Zhang.
Date Wed, 26 Oct 2016 18:31:00 GMT
HDFS-11040. Add documentation for HDFS-9820 distcp improvement. Contributed by Yongjun Zhang.


Project: http://git-wip-us.apache.org/repos/asf/hadoop/repo
Commit: http://git-wip-us.apache.org/repos/asf/hadoop/commit/0f0c15f7
Tree: http://git-wip-us.apache.org/repos/asf/hadoop/tree/0f0c15f7
Diff: http://git-wip-us.apache.org/repos/asf/hadoop/diff/0f0c15f7

Branch: refs/heads/YARN-4752
Commit: 0f0c15f7a5ea33ced781978bea971f3750883f41
Parents: 3a60573
Author: Yongjun Zhang <yzhang@cloudera.com>
Authored: Mon Oct 24 16:29:43 2016 -0700
Committer: Yongjun Zhang <yzhang@cloudera.com>
Committed: Tue Oct 25 12:25:40 2016 -0700

----------------------------------------------------------------------
 hadoop-tools/hadoop-distcp/src/site/markdown/DistCp.md.vm | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/hadoop/blob/0f0c15f7/hadoop-tools/hadoop-distcp/src/site/markdown/DistCp.md.vm
----------------------------------------------------------------------
diff --git a/hadoop-tools/hadoop-distcp/src/site/markdown/DistCp.md.vm b/hadoop-tools/hadoop-distcp/src/site/markdown/DistCp.md.vm
index e9cfdc7..40c6b04 100644
--- a/hadoop-tools/hadoop-distcp/src/site/markdown/DistCp.md.vm
+++ b/hadoop-tools/hadoop-distcp/src/site/markdown/DistCp.md.vm
@@ -233,7 +233,8 @@ Flag              | Description                          | Notes
 `-bandwidth` | Specify bandwidth per map, in MB/second. | Each map will be restricted to
consume only the specified bandwidth. This is not always exact. The map throttles back its
bandwidth consumption during a copy, such that the **net** bandwidth used tends towards the
specified value.
 `-atomic {-tmp <tmp_dir>}` | Specify atomic commit, with optional tmp directory. |
`-atomic` instructs DistCp to copy the source data to a temporary target location, and then
move the temporary target to the final-location atomically. Data will either be available
at final target in a complete and consistent form, or not at all. Optionally, `-tmp` may be
used to specify the location of the tmp-target. If not specified, a default is chosen. **Note:**
tmp_dir must be on the final target cluster.
 `-async` | Run DistCp asynchronously. Quits as soon as the Hadoop Job is launched. | The
Hadoop Job-id is logged, for tracking.
-`-diff <fromSnapshot> <toSnapshot>` | Use snapshot diff report between given
two snapshots to identify the difference between source and target. | This option is valid
only with `-update` option and the following conditions should be satisfied. 1. Both the source
and target FileSystem must be DistributedFileSystem. 2. Two snapshots (e.g., s1 and s2) have
been created on the source FS. The diff between these two snapshots will be copied to the
target FS. 3. The target has the same snapshot s1. No changes have been made on the target
since s1. All the files/directories in the target are the same with source.s1. |
+`-diff <oldSnapshot> <newSnapshot>` | Use snapshot diff report between given
two snapshots to identify the difference between source and target, and apply the diff to
the target to make it in sync with source. | This option is valid only with `-update` option
and the following conditions should be satisfied. <ol><li> Both the source and
the target FileSystem must be DistributedFileSystem.</li> <li> Two snapshots `<oldSnapshot>`
and `<newSnapshot>` have been created on the source FS, and `<oldSnapshot>` is
older than `<newSnapshot>`. </li> <li> The target has the same snapshot
`<oldSnapshot>`. No changes have been made on the target since `<oldSnapshot>`
was created, thus `<oldSnapshot>` has the same content as the current state of the target.
All the files/directories in the target are the same with source's `<oldSnapshot>`.</li></ol>
|
+`-rdiff <newSnapshot> <oldSnapshot>` | Use snapshot diff report between given
two snapshots to identify what has been changed on the target since the snapshot `<oldSnapshot>`
was created on the target, and apply the diff reversely to the target, and copy modified files
from the source's `<oldSnapshot>`, to make the target the same as `<oldSnapshot>`.
| This option is valid only with `-update` option and the following conditions should be satisfied.
<ol><li>Both the source and the target FileSystem must be DistributedFileSystem.
The source and the target can be two different clusters/paths, or they can be exactly the
same cluster/path. In the latter case, modified files are copied from target's `<oldSnapshot>`
to target's current state).</li>  <li> Two snapshots `<newSnapshot>` and
`<oldSnapshot>` have been created on the target FS, and `<oldSnapshot>` is older
than `<newSnapshot>`. No change has been made on target since `<newSnapshot>`
was created on the target. </li> <li> The source has the same snapshot `<oldSnapshot>`, which has the same content as the `<oldSnapshot>`
on the target. All the files/directories in the target's `<oldSnapshot>` are the same
with source's `<oldSnapshot>`.</li> </ol> |
 `-numListstatusThreads` | Number of threads to use for building file listing | At most 40
threads.
 `-skipcrccheck` | Whether to skip CRC checks between source and target paths. |
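
For readers of this change, the argument order of the two options documented above can be sketched as follows. This is a dry-run illustration only: it prints the command lines rather than launching a job, and the cluster URIs, paths, and snapshot names (`s1`, `s2`) are hypothetical placeholders, not values taken from the commit. It assumes `s1` is the older snapshot and `s2` the newer one, and that the preconditions listed in the table are already met.

```shell
#!/bin/sh
# Dry-run sketch: prints DistCp command lines instead of running them.
# SRC, DST, s1 and s2 are hypothetical placeholders.

SRC="hdfs://nn1:8020/data/src"   # hypothetical source path
DST="hdfs://nn2:8020/data/dst"   # hypothetical target path

# Forward sync: apply the source's snapshot diff to the target.
# Argument order is <oldSnapshot> <newSnapshot>; -update is required.
diff_cmd() {
  echo "hadoop distcp -update -diff $1 $2 $SRC $DST"
}

# Reverse sync: bring the target back to the content of <oldSnapshot>.
# Argument order is reversed: <newSnapshot> <oldSnapshot>; -update is required.
rdiff_cmd() {
  echo "hadoop distcp -update -rdiff $1 $2 $SRC $DST"
}

diff_cmd s1 s2
rdiff_cmd s2 s1
```

Note that for `-diff` the snapshots live on the source, while for `-rdiff` they live on the target, which is why the two options take their arguments in opposite order.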
 



