hadoop-general mailing list archives

From Eli Collins <...@cloudera.com>
Subject MR1 next steps
Date Mon, 25 Jul 2011 18:59:27 GMT
Hey gang,

We've had some discussion on what to do with regard to MR1 when MR2
gets merged in, and wanted to give you a heads up.  By MR1 I mean the
current MR implementation that uses the JobTracker, TaskTracker, etc.
On this thread (http://search-hadoop.com/m/GJliJ1uwjXu) on
mapreduce-dev@ we came to consensus that it makes sense to remove the
MR1 code from trunk (and the 23 release) and only support the MR2
implementation in 23 and going forward. In short, there are currently
three separate MR implementations (MR1 in stable, MR1 in trunk, and
MR2) and we'd like to only maintain two (MR1 in stable, MR2 in
trunk/23).  Note: MR2 supports the current job API, so users don't
need to rewrite their jobs to run on MR2. This is about the MR
*implementation*, not job compatibility.  The move to MR2 will,
however, affect some APIs (e.g. metrics, contrib projects that only
work against MR1, etc). The current MR1 implementation will of course
remain supported in the current stable releases.

Rationale: there's a lot of cost but little gain to maintaining three
MR implementations. Getting the MR1 code in trunk in shape so that it
is comparable in reliability/performance/features to the stable MR1
code is a lot of work. E.g. security is still not supported by MR1 in
trunk and it doesn't look like that's getting closed out, it hasn't
been tested at scale, etc. And it is unlikely that anyone will
volunteer to do this work given that we are moving to MR2. I.e. if you
want to use MR1 we'd recommend the stable release, and if you're using
23 we'd recommend MR2. Given that we wouldn't recommend anyone use MR1
in trunk/23, it doesn't make sense to ship it.

The current plan is to remove the MR1 code from trunk after merging in MR-279.

Thanks,
Eli
