hadoop-mapreduce-user mailing list archives

From MONTMORY Alain <alain.montm...@thalesgroup.com>
Subject RE: easiest way to install hadoop
Date Wed, 23 Feb 2011 09:18:14 GMT

From my point of view, it is not a trivial question...

The latest "stable release" is 0.20.2 (the version embedded in Cloudera CDH3), not 0.21.
When you start with Hadoop recently (end of 2010 for me), you find that the "old API" is deprecated,
so you start by using the new API.
But in 0.20.2 not all of the new API is available under mapreduce (for example, MultipleInputs is
not available), so you try version 0.21, where it is available.
But 0.21 does not seem very stable to me (we hit a "null pointer exception" in the framework
logs with no idea how to solve it), so we scaled back to 0.20.2 and we are using the "old API".

Search for "Re: Which version to choose" in the mailing list and follow the advice of Todd Lipcon.
The "old API" is not really deprecated; it will be supported for years because there are thousands of
jobs running on it. The "new API" can be used once a stable release is out (0.22,

This is feedback from my personal experience, where I lost time trying to use the latest 0.21
version. Since then I have used Cloudera 0.20.2+320 with the old API and I have not had any problems. We are
also using Cascading to simplify writing MR jobs, with very little performance overhead (6%)
versus native Hadoop MR jobs. Overall we gain a factor of 4.65 versus the traditional RDBMS approach.

Hope this helps,



From: real great.. [mailto:greatness.hardness@gmail.com]
Sent: Wednesday, 23 February 2011 04:42
To: mapreduce-user@hadoop.apache.org
Subject: easiest way to install hadoop

Very trivial question.
Which is the easiest way to install Hadoop?
I mean, which distribution should I go for: Apache or Cloudera?
And which is the easiest OS for Hadoop?

