Subject: Re: Looking to a Hadoop 3 release
From: Arun Murthy
To: common-dev@hadoop.apache.org, mapreduce-dev@hadoop.apache.org, hdfs-dev@hadoop.apache.org, yarn-dev@hadoop.apache.org
Date: Tue, 3 Mar 2015 02:30:08 +0000
Message-ID: <1425349807827.88706@hortonworks.com>

Andrew,

Thanks for bringing up this discussion.

I'm a little puzzled, for I feel like we are rehashing the same discussion from last year, where we agreed on a different course of action w.r.t. the switch to JDK7.

IAC, breaking compatibility for hadoop-3 is a pretty big cost, particularly for users such as Yahoo/Twitter/eBay who have several clusters between which compatibility is paramount.

Now, breaking compatibility is perfectly fine over time where there is sufficient benefit, e.g. HDFS HA or YARN in hadoop-2 (vs. hadoop-1).

However, I'm struggling to quantify the benefit of hadoop-3 for users for the cost of the breakage. Given that we already agreed to put in JDK7 in 2.7, and that the classpath is a fairly minor irritant given some existing solutions (e.g. a new default classloader), how do you quantify the benefit for users?

We could just do JDK8 in hadoop-2.10 or some such; you are definitely welcome to run the RM role for that release.

Furthermore, I'm really concerned that this will be used as an opportunity to further break compat in more egregious ways.

Also, are you foreseeing more compat breaks? OTOH, if we all agree that we should absolutely prevent compat breakages such as the client-server wire protocol, I feel the point of a major release is kinda lost.

Overall, my biggest concern is the compatibility story vis-a-vis the benefit.

Thoughts?

thanks,
Arun

________________________________________
From: Andrew Wang
Sent: Monday, March 02, 2015 3:19 PM
To: common-dev@hadoop.apache.org; mapreduce-dev@hadoop.apache.org; hdfs-dev@hadoop.apache.org; yarn-dev@hadoop.apache.org
Subject: Looking to a Hadoop 3 release

Hi devs,

It's been a year and a half since 2.x went GA, and I think we're about due for a 3.x release. Notably, there are two incompatible changes I'd like to call out that will have a tremendous positive impact for our users.

First is classpath isolation, being done at HADOOP-11656, which has been a long-standing request from many downstreams and Hadoop users.

Second is bumping the source and target JDK version to JDK8 (related to HADOOP-11090), which is important since JDK7 is EOL in April 2015 (two months from now). In the past, we've had issues with our dependencies discontinuing support for old JDKs, so this will future-proof us.

Between the two, we'll also have quite an opportunity to clean up and upgrade our dependencies, another common user and developer request.

I'd like to propose that we start rolling a series of monthly-ish 3.0 alpha releases ASAP, with myself volunteering to take on the RM and other cat herding responsibilities.
There are already quite a few changes slated for 3.0 besides the above (for instance the shell script rewrite) so there's already value in a 3.0 alpha, and the more time we give downstreams to integrate, the better.

This opens up discussion about inclusion of other changes, but I'm hoping to freeze incompatible changes after maybe two alphas, do a beta (with no further incompat changes allowed), and then finally a 3.x GA. For those keeping track, that means a 3.x GA in about four months.

I would also like to stress though that this is not intended to be a big bang release. For instance, it would be great if we could maintain wire compatibility between 2.x and 3.x, so rolling upgrades work. Keeping branch-2 and branch-3 similar also makes backports easier, since we're likely maintaining 2.x for a while yet.

Please let me know any comments / concerns related to the above. If people are friendly to the idea, I'd like to cut a branch-3 and start working on the first alpha.

Best,
Andrew