hadoop-common-commits mailing list archives

From wan...@apache.org
Subject [25/42] hadoop git commit: HDFS-8942. Update hyperlink to rack awareness page in HDFS Architecture documentation. Contributed by Masatake Iwasaki.
Date Tue, 25 Aug 2015 17:12:36 GMT
HDFS-8942. Update hyperlink to rack awareness page in HDFS Architecture documentation. Contributed
by Masatake Iwasaki.


Project: http://git-wip-us.apache.org/repos/asf/hadoop/repo
Commit: http://git-wip-us.apache.org/repos/asf/hadoop/commit/bcaf8390
Tree: http://git-wip-us.apache.org/repos/asf/hadoop/tree/bcaf8390
Diff: http://git-wip-us.apache.org/repos/asf/hadoop/diff/bcaf8390

Branch: refs/heads/YARN-1197
Commit: bcaf83902aa4d1e3e2cd26442df0a253eae7f633
Parents: b71c600
Author: Akira Ajisaka <aajisaka@apache.org>
Authored: Mon Aug 24 13:52:49 2015 +0900
Committer: Akira Ajisaka <aajisaka@apache.org>
Committed: Mon Aug 24 13:52:49 2015 +0900

----------------------------------------------------------------------
 hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt                     | 3 +++
 hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsDesign.md | 3 ++-
 2 files changed, 5 insertions(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/hadoop/blob/bcaf8390/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
----------------------------------------------------------------------
diff --git a/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt b/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
index 78f69fb..0b7bc90 100644
--- a/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
+++ b/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
@@ -1198,6 +1198,9 @@ Release 2.8.0 - UNRELEASED
 
     HDFS-8809. HDFS fsck reports under construction blocks as "CORRUPT". (jing9)
 
+    HDFS-8942. Update hyperlink to rack awareness page in HDFS Architecture
+    documentation. (Masatake Iwasaki via aajisaka)
+
 Release 2.7.2 - UNRELEASED
 
   INCOMPATIBLE CHANGES

http://git-wip-us.apache.org/repos/asf/hadoop/blob/bcaf8390/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsDesign.md
----------------------------------------------------------------------
diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsDesign.md b/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsDesign.md
index aa94a2f..c441ae8 100644
--- a/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsDesign.md
+++ b/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsDesign.md
@@ -116,7 +116,8 @@ The placement of replicas is critical to HDFS reliability and performance. Optim
 
 Large HDFS instances run on a cluster of computers that commonly spread across many racks. Communication between two nodes in different racks has to go through switches. In most cases, network bandwidth between machines in the same rack is greater than network bandwidth between machines in different racks.
 
-The NameNode determines the rack id each DataNode belongs to via the process outlined in [Hadoop Rack Awareness](../hadoop-common/ClusterSetup.html#HadoopRackAwareness). A simple but non-optimal policy is to place replicas on unique racks. This prevents losing data when an entire rack fails and allows use of bandwidth from multiple racks when reading data. This policy evenly distributes replicas in the cluster which makes it easy to balance load on component failure. However, this policy increases the cost of writes because a write needs to transfer blocks to multiple racks.
+The NameNode determines the rack id each DataNode belongs to via the process outlined in [Hadoop Rack Awareness](../hadoop-common/RackAwareness.html).
+A simple but non-optimal policy is to place replicas on unique racks. This prevents losing data when an entire rack fails and allows use of bandwidth from multiple racks when reading data. This policy evenly distributes replicas in the cluster which makes it easy to balance load on component failure. However, this policy increases the cost of writes because a write needs to transfer blocks to multiple racks.
 
 For the common case, when the replication factor is three, HDFS’s placement policy is to put one replica on one node in the local rack, another on a different node in the local rack, and the last on a different node in a different rack. This policy cuts the inter-rack write traffic which generally improves write performance. The chance of rack failure is far less than that of node failure; this policy does not impact data reliability and availability guarantees. However, it does reduce the aggregate network bandwidth used when reading data since a block is placed in only two unique racks rather than three. With this policy, the replicas of a file do not evenly distribute across the racks. One third of replicas are on one node, two thirds of replicas are on one rack, and the other third are evenly distributed across the remaining racks. This policy improves write performance without compromising data reliability or read performance.
 
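
For readers following the documentation text quoted in the diff above, the three-replica placement it describes can be sketched roughly as below. This is an illustration only and not part of the commit or of HDFS itself: the class ReplicaPlacementSketch, the methods rackOf and chooseTargets, and the rack and node names are all hypothetical. The sketch assumes the writer runs on a DataNode and follows the wording of the quoted paragraph (two replicas in the local rack, one in a different rack) rather than any actual NameNode code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the three-replica policy as worded in the quoted text.
class ReplicaPlacementSketch {

    // Returns the rack id that contains the given node, or null if unknown.
    static String rackOf(String node, Map<String, List<String>> rackToNodes) {
        for (Map.Entry<String, List<String>> e : rackToNodes.entrySet()) {
            if (e.getValue().contains(node)) {
                return e.getKey();
            }
        }
        return null;
    }

    // Picks up to three targets: the writer's node, another node in the
    // same rack, and one node in a different rack.
    static List<String> chooseTargets(String writer, Map<String, List<String>> rackToNodes) {
        String localRack = rackOf(writer, rackToNodes);
        if (localRack == null) {
            throw new IllegalArgumentException("unknown node: " + writer);
        }
        List<String> targets = new ArrayList<>();

        // Replica 1: one node in the local rack (assumed here to be the writer).
        targets.add(writer);

        // Replica 2: a different node in the local rack, if one exists.
        for (String node : rackToNodes.get(localRack)) {
            if (!node.equals(writer)) {
                targets.add(node);
                break;
            }
        }

        // Replica 3: a node in a different rack, if one exists.
        for (Map.Entry<String, List<String>> e : rackToNodes.entrySet()) {
            if (!e.getKey().equals(localRack) && !e.getValue().isEmpty()) {
                targets.add(e.getValue().get(0));
                break;
            }
        }
        return targets;
    }

    public static void main(String[] args) {
        // Made-up two-rack topology for illustration.
        Map<String, List<String>> topology = new LinkedHashMap<>();
        topology.put("/rack1", Arrays.asList("dn1", "dn2"));
        topology.put("/rack2", Arrays.asList("dn3", "dn4"));
        System.out.println(chooseTargets("dn1", topology));
    }
}
```

With the made-up topology in main, a write from dn1 selects dn1 and dn2 in /rack1 and dn3 in /rack2, so only one replica crosses a rack boundary, which is the write-cost saving the quoted paragraph describes.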

