hadoop-common-commits mailing list archives

From: weic...@apache.org
Subject: hadoop git commit: HDFS-10974. Document replication factor for EC files. Contributed by Yiqun Lin.
Date: Thu, 30 Mar 2017 18:17:22 GMT
Repository: hadoop
Updated Branches:
  refs/heads/trunk c9b7ce927 -> 8c591b8d1


HDFS-10974. Document replication factor for EC files. Contributed by Yiqun Lin.


Project: http://git-wip-us.apache.org/repos/asf/hadoop/repo
Commit: http://git-wip-us.apache.org/repos/asf/hadoop/commit/8c591b8d
Tree: http://git-wip-us.apache.org/repos/asf/hadoop/tree/8c591b8d
Diff: http://git-wip-us.apache.org/repos/asf/hadoop/diff/8c591b8d

Branch: refs/heads/trunk
Commit: 8c591b8d199325f49b5bba29240ca25610cf80a0
Parents: c9b7ce9
Author: Wei-Chiu Chuang <weichiu@apache.org>
Authored: Thu Mar 30 11:16:05 2017 -0700
Committer: Wei-Chiu Chuang <weichiu@apache.org>
Committed: Thu Mar 30 11:16:05 2017 -0700

----------------------------------------------------------------------
 .../org/apache/hadoop/fs/shell/SetReplication.java     | 13 +++++++------
 .../hadoop-common/src/site/markdown/FileSystemShell.md |  2 +-
 .../hadoop-common/src/test/resources/testConf.xml      |  2 +-
 .../hadoop-hdfs/src/site/markdown/HDFSErasureCoding.md |  1 +
 .../hadoop-distcp/src/site/markdown/DistCp.md.vm       |  2 +-
 5 files changed, 11 insertions(+), 9 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/hadoop/blob/8c591b8d/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/SetReplication.java
----------------------------------------------------------------------
diff --git a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/SetReplication.java b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/SetReplication.java
index fab0349..2231c58 100644
--- a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/SetReplication.java
+++ b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/SetReplication.java
@@ -41,12 +41,13 @@ class SetReplication extends FsCommand {
   public static final String NAME = "setrep";
   public static final String USAGE = "[-R] [-w] <rep> <path> ...";
   public static final String DESCRIPTION =
-    "Set the replication level of a file. If <path> is a directory " +
-    "then the command recursively changes the replication factor of " +
-    "all files under the directory tree rooted at <path>.\n" +
-    "-w: It requests that the command waits for the replication " +
-    "to complete. This can potentially take a very long time.\n" +
-    "-R: It is accepted for backwards compatibility. It has no effect.";
+      "Set the replication level of a file. If <path> is a directory " +
+      "then the command recursively changes the replication factor of " +
+      "all files under the directory tree rooted at <path>. " +
+      "The EC files will be ignored here.\n" +
+      "-w: It requests that the command waits for the replication " +
+      "to complete. This can potentially take a very long time.\n" +
+      "-R: It is accepted for backwards compatibility. It has no effect.";
   
   protected short newRep = 0;
   protected List<PathData> waitList = new LinkedList<PathData>();
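
The behavior the new help text describes (recursive descent, with EC files left untouched) can be sketched with a toy in-memory tree. This is only an illustration under stated assumptions: `SetRepSketch`, `Node`, and `setReplication` are hypothetical names invented here, not Hadoop classes, and the real command works through the FileSystem API rather than anything like this.

```java
import java.util.Arrays;
import java.util.List;

/** Toy model of "setrep skips EC files". Hypothetical; not Hadoop code. */
public class SetRepSketch {

  /** Hypothetical tree node: a directory if children != null, else a file. */
  static final class Node {
    final String name;
    final boolean erasureCoded; // only meaningful for files
    final List<Node> children;  // null for files
    short replication;

    private Node(String name, boolean erasureCoded, short replication,
                 List<Node> children) {
      this.name = name;
      this.erasureCoded = erasureCoded;
      this.replication = replication;
      this.children = children;
    }

    static Node file(String name, boolean erasureCoded, int replication) {
      return new Node(name, erasureCoded, (short) replication, null);
    }

    static Node dir(String name, Node... kids) {
      return new Node(name, false, (short) 0, Arrays.asList(kids));
    }

    boolean isDirectory() {
      return children != null;
    }
  }

  /**
   * Applies newRep to every replicated file under node and returns how many
   * files were changed. Erasure-coded files are skipped.
   */
  static int setReplication(Node node, short newRep) {
    if (node.isDirectory()) {
      int changed = 0;
      for (Node child : node.children) {
        changed += setReplication(child, newRep);
      }
      return changed;
    }
    if (node.erasureCoded) {
      return 0; // EC file: ignored, its replication value stays as it was
    }
    node.replication = newRep;
    return 1;
  }

  public static void main(String[] args) {
    Node replicated = Node.file("a.txt", false, 3);
    Node ec = Node.file("b.ec", true, 1);
    Node root = Node.dir("data", replicated, Node.dir("sub", ec));

    int changed = setReplication(root, (short) 2);
    System.out.println("changed=" + changed
        + " a.txt=" + replicated.replication
        + " b.ec=" + ec.replication);
  }
}
```

In this model only `a.txt` is updated; the EC file keeps its replication value of 1, which matches the note this commit adds to HDFSErasureCoding.md.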

http://git-wip-us.apache.org/repos/asf/hadoop/blob/8c591b8d/hadoop-common-project/hadoop-common/src/site/markdown/FileSystemShell.md
----------------------------------------------------------------------
diff --git a/hadoop-common-project/hadoop-common/src/site/markdown/FileSystemShell.md b/hadoop-common-project/hadoop-common/src/site/markdown/FileSystemShell.md
index 42fddc9..5db96eb 100644
--- a/hadoop-common-project/hadoop-common/src/site/markdown/FileSystemShell.md
+++ b/hadoop-common-project/hadoop-common/src/site/markdown/FileSystemShell.md
@@ -647,7 +647,7 @@ setrep
 
 Usage: `hadoop fs -setrep [-R] [-w] <numReplicas> <path> `
 
-Changes the replication factor of a file. If *path* is a directory then the command recursively changes the replication factor of all files under the directory tree rooted at *path*.
+Changes the replication factor of a file. If *path* is a directory then the command recursively changes the replication factor of all files under the directory tree rooted at *path*. The EC files will be ignored when executing this command.
 
 Options:
 

http://git-wip-us.apache.org/repos/asf/hadoop/blob/8c591b8d/hadoop-common-project/hadoop-common/src/test/resources/testConf.xml
----------------------------------------------------------------------
diff --git a/hadoop-common-project/hadoop-common/src/test/resources/testConf.xml b/hadoop-common-project/hadoop-common/src/test/resources/testConf.xml
index 112aea0..6347aa0 100644
--- a/hadoop-common-project/hadoop-common/src/test/resources/testConf.xml
+++ b/hadoop-common-project/hadoop-common/src/test/resources/testConf.xml
@@ -778,7 +778,7 @@
         </comparator>
         <comparator>
           <type>RegexpComparator</type>
-          <expected-output>^\s*rooted at &lt;path&gt;\.( )*</expected-output>
+          <expected-output>^\s*rooted at &lt;path&gt;\. The EC files will be ignored here\.( )*</expected-output>
         </comparator>
         <comparator>
             <type>RegexpComparator</type>
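
The updated expectation can be checked in isolation. The sketch below assumes the comparator tests each output line separately and requires a full match; that is an assumption about the test harness, not something shown in this diff. The sample help lines are hypothetical, since the actual line wrapping depends on how the shell formats the description. `EcHelpRegexCheck` and `anyLineMatches` are names invented for this example.

```java
import java.util.regex.Pattern;

/** Checks the new testConf.xml pattern against sample help output. */
public class EcHelpRegexCheck {

  // Pattern from the diff, with XML entities decoded (&lt; is <, &gt; is >).
  static final Pattern EXPECTED = Pattern.compile(
      "^\\s*rooted at <path>\\. The EC files will be ignored here\\.( )*");

  /** Returns true if any single line of the output fully matches the pattern. */
  static boolean anyLineMatches(String output) {
    for (String line : output.split("\\r?\\n")) {
      if (EXPECTED.matcher(line).matches()) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    // Hypothetical wrapped help text after the change.
    String newHelp = "  all files under the directory tree\n"
        + "  rooted at <path>. The EC files will be ignored here.\n";
    // Hypothetical wrapped help text before the change.
    String oldHelp = "  all files under the directory tree\n"
        + "  rooted at <path>.\n";

    System.out.println("new help matches: " + anyLineMatches(newHelp));
    System.out.println("old help matches: " + anyLineMatches(oldHelp));
  }
}
```

Under those assumptions the new help text satisfies the pattern and the old text does not, which is why the expected output had to change together with `DESCRIPTION`.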

http://git-wip-us.apache.org/repos/asf/hadoop/blob/8c591b8d/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSErasureCoding.md
----------------------------------------------------------------------
diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSErasureCoding.md b/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSErasureCoding.md
index 04acdce..f0c487d 100644
--- a/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSErasureCoding.md
+++ b/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSErasureCoding.md
@@ -23,6 +23,7 @@ Purpose
  However, for warm and cold datasets with relatively low I/O activities, additional block replicas are rarely accessed during normal operations, but still consume the same amount of resources as the first replica.
 
  Therefore, a natural improvement is to use Erasure Coding (EC) in place of replication, which provides the same level of fault-tolerance with much less storage space. In typical Erasure Coding (EC) setups, the storage overhead is no more than 50%.
+  Replication factor of an EC file is meaningless. It is always 1 and cannot be changed via -setrep command.
 
 Background
 ----------

http://git-wip-us.apache.org/repos/asf/hadoop/blob/8c591b8d/hadoop-tools/hadoop-distcp/src/site/markdown/DistCp.md.vm
----------------------------------------------------------------------
diff --git a/hadoop-tools/hadoop-distcp/src/site/markdown/DistCp.md.vm b/hadoop-tools/hadoop-distcp/src/site/markdown/DistCp.md.vm
index dbf0e8d..41a6e94 100644
--- a/hadoop-tools/hadoop-distcp/src/site/markdown/DistCp.md.vm
+++ b/hadoop-tools/hadoop-distcp/src/site/markdown/DistCp.md.vm
@@ -217,7 +217,7 @@ Command Line Options
 
 Flag              | Description                          | Notes
 ----------------- | ------------------------------------ | --------
-`-p[rbugpcaxt]` | Preserve r: replication number b: block size u: user g: group p: permission c: checksum-type a: ACL x: XAttr t: timestamp | When `-update` is specified, status updates will **not** be synchronized unless the file sizes also differ (i.e. unless the file is re-created). If -pa is specified, DistCp preserves the permissions also because ACLs are a super-set of permissions.
+`-p[rbugpcaxt]` | Preserve r: replication number b: block size u: user g: group p: permission c: checksum-type a: ACL x: XAttr t: timestamp | When `-update` is specified, status updates will **not** be synchronized unless the file sizes also differ (i.e. unless the file is re-created). If -pa is specified, DistCp preserves the permissions also because ACLs are a super-set of permissions. The option -pr is only valid if both source and target directory are not erasure coded.
 `-i` | Ignore failures | As explained in the Appendix, this option will keep more accurate statistics about the copy than the default case. It also preserves logs from failed copies, which can be valuable for debugging. Finally, a failing map will not cause the job to fail before all splits are attempted.
 `-log <logdir>` | Write logs to \<logdir\> | DistCp keeps logs of each file it attempts to copy as map output. If a map fails, the log output will not be retained if it is re-executed.
 `-m <num_maps>` | Maximum number of simultaneous copies | Specify the number of maps to copy data. Note that more maps may not necessarily improve throughput.


---------------------------------------------------------------------
To unsubscribe, e-mail: common-commits-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-commits-help@hadoop.apache.org

