Subject: Re: Thinking about the next hadoop mainline release
From: Brian Bockelman
Date: Fri, 17 Jun 2011 09:30:13 -0500
To: general@hadoop.apache.org
Message-Id: <39C6561C-9CB6-4BA5-BE31-C269590B7842@cse.unl.edu>

Hi Ryan, Eric,

Just looked at those two for the first time in a while.
- HDFS-918 (now 1323?) doesn't seem like it's too controversial, but does seem like there's a bit of validation left.
- HDFS-347 has a long, contentious history. However, it seems that most of the strong objections have been cleared up. Is there anyone left who objects to it, now that it doesn't appear to bypass security?

Finally, I see Todd has posted HDFS-2080 claiming some sizable performance improvements. Would it be possible for that to finish in time for the release?

As a site which heavily uses random reads and high-throughput reads, I'm very excited for this release!

Brian

On Jun 17, 2011, at 2:36 AM, Ryan Rawson wrote:

> HDFS-918 and HDFS-347 are absolutely critical for random read
> performance. The smarter sites are already running HDFS-347 (I guess
> they aren't running "Hadoop" then?), and soon they will be testing and
> running HDFS-918 as well. Opening 1 socket for every read just isn't
> really scalable.
>
> -ryan
>
> On Fri, Jun 17, 2011 at 12:17 AM, Eric Baldeschwieler wrote:
>> Hi Folks,
>>
>> I'd like to start a conversation on mainline planning and the next release of Apache Hadoop beyond 0.22.
>>
>> The Yahoo! Hadoop team has been working hard to complete several big Hadoop projects, including:
>>
>> - HDFS Federation [HDFS-1052]
>>     - Already merged into trunk
>>
>> - Next Generation Map-Reduce [MR-279]
>>     - Passing most tests now and discussing merging into trunk
>>
>> - The merging of our previous work on Hadoop with security into mainline [http://yhoo.it/i9Ww8W]
>>     - This is mostly done, but Owen and others are doing a scrub to close out the remaining issues
>>
>> All of these projects are now reaching a place where we would like to combine them with the good work already in 0.22 and put out a new Apache release, perhaps 0.23. We think the best way to accomplish that is to finish the merge in the next few weeks and then cut a release from trunk.
>>
>> Yahoo stands ready to help us (the Apache Hadoop Community) turn this new release into a stable release by running it through its 9-month test and burn-in process. The result of that will be another stable release such as 0.18, 0.20 or 0.20.203 (Hadoop with security).
>> We have Yahoo!'s support for this substantial investment because this new release will have a great combination of new features for small and very large sites alike:
>> - New Write Pipeline - HBase support [also in 0.21 & 0.22]
>> - Federation - Scale up to larger clusters and the ability to experiment with new namenode approaches
>> - Next Gen MapReduce - Scaleup, performance improvements, ability to experiment with new processing frameworks
>>
>> I think this effort will produce a great new Apache Hadoop release for the community. I'm starting this thread to collect feedback and hopefully folks' endorsement for merging in MR-279 and putting together this new release. Feedback please?
>>
>> Thanks,
>>
>> E14