From: "Gangumalla, Uma"
To: "hdfs-dev@hadoop.apache.org", "common-dev@hadoop.apache.org"
Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk
Date: Wed, 23 Sep 2015 00:50:00 +0000

+1

Great addition to HDFS. Thanks all contributors for the nice work.

Regards,
Uma

On 9/22/15, 3:40 PM, "Zhe Zhang" wrote:

>Hi,
>
>I'd like to propose a vote to merge the HDFS-7285 feature branch back to
>trunk. Since November 2014 we have been designing and developing this
>feature under the umbrella JIRAs HDFS-7285 and HADOOP-11264, and have
>committed approximately 210 patches.
>
>The HDFS-7285 feature branch was created to support the first phase of
>HDFS erasure coding (HDFS-EC). The objective of HDFS-EC is to significantly
>reduce storage space usage in HDFS clusters. Instead of always creating 3
>replicas of each block with 200% storage space overhead, HDFS-EC provides
>data durability through parity data blocks.
>With most EC configurations, the storage overhead is no more than 50%.
>Based on profiling results of production clusters, we decided to support
>EC with the striped block layout in the first phase, so that small files
>can be better handled. This means dividing each logical HDFS file block
>into smaller units (striping cells) and spreading them on a set of
>DataNodes in round-robin fashion. Parity cells are generated for each
>stripe of original data cells. We have made changes to NameNode, client,
>and DataNode to generalize the block concept and handle the mapping
>between a logical file block and its internal storage blocks. For further
>details please see the design doc on HDFS-7285.
>HADOOP-11264 focuses on providing flexible and high-performance codec
>calculation support.
>
>The nightly Jenkins job of the branch has reported several successful
>runs, and doesn't show new flaky tests compared with trunk. We have posted
>several versions of the test plan including both unit testing and cluster
>testing, and have executed most tests in the plan. The most basic
>functionalities have been extensively tested and verified in several real
>clusters with different hardware configurations; results have been very
>stable. We have created follow-on tasks for more advanced error handling
>and optimization under the umbrella HDFS-8031. We also plan to implement
>or harden the integration of EC with existing features such as WebHDFS,
>snapshot, append, truncate, hflush, hsync, and so forth.
>
>Development of this feature has been a collaboration across many companies
>and institutions. I'd like to thank J. Andreina, Takanobu Asanuma,
>Vinayakumar B, Li Bo, Takuya Fukudome, Uma Maheswara Rao G, Rui Li, Yi
>Liu, Colin McCabe, Xinwei Qin, Rakesh R, Gao Rui, Kai Sasaki, Walter Su,
>Tsz Wo Nicholas Sze, Andrew Wang, Yong Zhang, Jing Zhao, Hui Zheng and Kai
>Zheng for their code contributions and reviews.
>Andrew and Kai Zheng also made fundamental contributions to the initial
>design. Rui Li, Gao Rui, Kai Sasaki, Kai Zheng and many other contributors
>have made great efforts in system testing. Many thanks go to Weihua Jiang
>for proposing the JIRA, and ATM, Todd Lipcon, Silvius Rus, Suresh, as well
>as many others for providing helpful feedback.
>
>Following the community convention, this vote will last for 7 days (ending
>September 29th). Votes from Hadoop committers are binding but non-binding
>votes are very welcome as well. And here's my non-binding +1.
>
>Thanks,
>---
>Zhe Zhang
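
The storage overhead figures and the round-robin striping described in the quoted proposal can be illustrated with a small sketch. This is not code from the HDFS-7285 branch. The 6 data + 3 parity schema and the 64 KiB cell size are assumptions chosen only for illustration, and the function names are hypothetical. Overhead is taken to mean extra bytes stored per byte of user data, which is consistent with the 200% figure quoted for 3 replicas.

```python
# Illustrative sketch only; names and parameters are hypothetical and
# not taken from the HDFS-7285 branch.

def replication_overhead(replicas: int) -> float:
    """Extra storage per byte of user data under plain replication."""
    return float(replicas - 1)


def ec_overhead(data_units: int, parity_units: int) -> float:
    """Extra storage per byte of user data with data + parity units."""
    return parity_units / data_units


def cell_location(offset: int, cell_size: int, data_units: int):
    """Map a logical byte offset to (stripe, internal block, offset in block).

    Cells are laid out round-robin: cell 0 goes to internal block 0,
    cell 1 to internal block 1, and so on, wrapping after data_units cells.
    """
    cell_index = offset // cell_size
    stripe = cell_index // data_units
    block_in_group = cell_index % data_units
    offset_in_block = stripe * cell_size + offset % cell_size
    return stripe, block_in_group, offset_in_block


if __name__ == "__main__":
    CELL = 64 * 1024          # assumed cell size
    DATA, PARITY = 6, 3       # assumed schema

    print("3x replication overhead:", replication_overhead(3))
    print("6+3 erasure coding overhead:", ec_overhead(DATA, PARITY))
    # The eighth cell (index 7) lands in stripe 1 on internal block 1.
    print(cell_location(7 * CELL, CELL, DATA))
```

With these assumed parameters the erasure-coded layout stores half an extra byte per byte of data, against two extra bytes for 3 replicas. Because consecutive cells go to different internal blocks, even a file much smaller than a full block is spread across several DataNodes, which is the small-file benefit the proposal mentions.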