hadoop-common-dev mailing list archives

From Zhe Zhang <zhezh...@cloudera.com>
Subject [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk
Date Tue, 22 Sep 2015 22:40:59 GMT

I'd like to propose a vote to merge the HDFS-7285 feature branch back to
trunk. Since November 2014 we have been designing and developing this
feature under the umbrella JIRAs HDFS-7285 and HADOOP-11264, and have
committed approximately 210 patches.

The HDFS-7285 feature branch was created to support the first phase of HDFS
erasure coding (HDFS-EC). The objective of HDFS-EC is to significantly
reduce storage space usage in HDFS clusters. Instead of always creating 3
replicas of each block with 200% storage space overhead, HDFS-EC provides
data durability through parity data blocks. With most EC configurations,
the storage overhead is no more than 50%. Based on profiling results of
production clusters, we decided to support EC with the striped block layout
in the first phase, so that small files can be better handled. This means
dividing each logical HDFS file block into smaller units (striping cells)
and spreading them on a set of DataNodes in round-robin fashion. Parity
cells are generated for each stripe of original data cells. We have made
changes to NameNode, client, and DataNode to generalize the block concept
and handle the mapping between a logical file block and its internal
storage blocks. For further details please see the design doc on HDFS-7285.
HADOOP-11264 focuses on providing flexible and high-performance codec
calculation support.
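
As a rough illustration of the overhead figures and the striped layout described above, here is a minimal sketch. It is not taken from the branch. The schema of 6 data units plus 3 parity units, the 64 KiB cell size, and all function names are assumptions chosen only for the example.

```python
# Illustrative sketch only; names and parameters are hypothetical,
# not the HDFS-7285 implementation.

CELL_SIZE = 64 * 1024   # assumed striping cell size in bytes
DATA_UNITS = 6          # assumed number of data blocks per block group
PARITY_UNITS = 3        # assumed number of parity blocks per block group


def replication_overhead(replicas):
    """Extra storage relative to the raw data when keeping full replicas."""
    return replicas - 1


def ec_overhead(data_units, parity_units):
    """Extra storage relative to the raw data when using parity blocks."""
    return parity_units / data_units


def locate(offset, cell_size=CELL_SIZE, data_units=DATA_UNITS):
    """Map a logical byte offset to (internal block index, stripe index,
    offset inside that internal block) under round-robin striping."""
    cell_index = offset // cell_size
    stripe = cell_index // data_units        # which stripe the cell is in
    block = cell_index % data_units          # round-robin over data blocks
    offset_in_block = stripe * cell_size + offset % cell_size
    return block, stripe, offset_in_block


if __name__ == "__main__":
    print(replication_overhead(3))                 # 3 replicas: 2x extra (200%)
    print(ec_overhead(DATA_UNITS, PARITY_UNITS))   # 6+3 schema: 0.5x extra (50%)
    print(locate(7 * CELL_SIZE))                   # 8th cell of the logical block
```

With these assumed parameters, consecutive cells of a logical block land on internal blocks 0 through 5 and then wrap to the next stripe, which is why even a small file is spread across several DataNodes.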

The nightly Jenkins job of the branch has reported several successful runs,
and doesn't show new flaky tests compared with trunk. We have posted
several versions of the test plan including both unit testing and cluster
testing, and have executed most tests in the plan. The most basic
functionalities have been extensively tested and verified in several real
clusters with different hardware configurations; results have been very
stable. We have created follow-on tasks for more advanced error handling
and optimization under the umbrella HDFS-8031. We also plan to implement or
harden the integration of EC with existing features such as WebHDFS,
snapshot, append, truncate, hflush, hsync, and so forth.

Development of this feature has been a collaboration across many companies
and institutions. I'd like to thank J. Andreina, Takanobu Asanuma,
Vinayakumar B, Li Bo, Takuya Fukudome, Uma Maheswara Rao G, Rui Li, Yi Liu,
Colin McCabe, Xinwei Qin, Rakesh R, Gao Rui, Kai Sasaki, Walter Su, Tsz Wo
Nicholas Sze, Andrew Wang, Yong Zhang, Jing Zhao, Hui Zheng and Kai Zheng
for their code contributions and reviews. Andrew and Kai Zheng also made
fundamental contributions to the initial design. Rui Li, Gao Rui, Kai
Sasaki, Kai Zheng and many other contributors have made great efforts in
system testing. Many thanks go to Weihua Jiang for proposing the JIRA, and
to ATM, Todd Lipcon, Silvius Rus, Suresh, and many others for providing
helpful feedback.

Following the community convention, this vote will last for 7 days (ending
September 29th). Votes from Hadoop committers are binding, but non-binding
votes are very welcome as well. Here's my non-binding +1.

Zhe Zhang
