Date: Mon, 25 Jul 2011 11:59:27 -0700
Subject: MR1 next steps
From: Eli Collins
To: general@hadoop.apache.org

Hey gang,

We've had some discussion on what to do with regard to MR1 when MR2 gets merged in, and wanted to give you a heads up.
By MR1 I mean the current MR implementation that uses the JobTracker, TaskTracker, etc. On this thread (http://search-hadoop.com/m/GJliJ1uwjXu) on mapreduce-dev@ we came to consensus that it makes sense to remove the MR1 code from trunk (and the 23 release) and only support the MR2 implementation in 23 and going forward. In short, there are currently three separate MR implementations and we'd like to only maintain two (MR1 in stable, MR2 in trunk/23).

Note! MR2 supports the current job API - users don't need to rewrite their jobs to run on MR2 - this is about the MR *implementation*, not job compatibility. Note that the move to MR2 will affect some APIs (e.g. metrics, contrib projects that only work against MR1, etc). The current MR1 implementation will of course remain supported in the current stable releases.

Rationale: there's a lot of cost but little gain to maintaining three MR implementations. Getting the MR1 code in trunk in shape so that it is comparable in reliability/performance/features to the stable MR1 code is a lot of work. E.g. security is still not supported by MR1 in trunk and it doesn't look like that's getting closed out, it hasn't been tested at scale, etc. And it is unlikely that anyone will volunteer to do this work given that we are moving to MR2. I.e. if you want to use MR1 we'd recommend the stable release, and if you're using 23 we'd recommend MR2. Given that we wouldn't recommend anyone use MR1 in trunk/23, it doesn't make sense to ship it.

The current plan is to remove the MR1 code from trunk after merging in MR-279.

Thanks,
Eli