hadoop-common-user mailing list archives

From Robert Evans <ev...@yahoo-inc.com>
Subject Re: Which release to use?
Date Fri, 15 Jul 2011 14:35:45 GMT

Yahoo! no longer has its own distribution of Hadoop.  It has been merged into the 0.20.2XX
line, so 0.20.203 is what Yahoo is running internally right now, and we are moving towards
0.20.204, which should be out soon.  I am not an expert on Cloudera, so I cannot really map
its releases to the Apache releases.  Their distro is based on Apache Hadoop with a few bug
fixes and maybe a few features, such as append, added on top, but you need to talk to
Cloudera about the exact details.  For the most part they are all very similar.  You need
to think most about support; there are several companies that can sell you support if you
want or need it.  You also need to think about features vs. stability.  The 0.20.203 release
has been tested on a lot of machines by many different groups, but it may be missing some
features that are needed in some situations.


On 7/14/11 11:49 PM, "Adarsh Sharma" <adarsh.sharma@orkash.com> wrote:

Hadoop releases are issued from time to time. But one more thing related to
Hadoop usage:

There are several providers that distribute Hadoop:

1. Apache Hadoop
2. Cloudera
3. Yahoo

Which distribution is best among them for production use?
I think Cloudera's is the best among them.

Best Regards,

Owen O'Malley wrote:
> On Jul 14, 2011, at 4:33 PM, Teruhiko Kurosaka wrote:
>> I'm a newbie and I am confused by the Hadoop releases.
>> I thought 0.21.0 is the latest & greatest release that I
>> should be using but I noticed 0.20.203 has been released
>> lately, and 0.21.X is marked "unstable, unsupported".
>> Should I be using 0.20.203?
> Yes, I apologize for the confusing release numbering, but the best release to use is
> It includes security, job limits, and many other improvements over 0.20.2 and 0.21.0. Unfortunately,
> it doesn't have the new sync support, so it isn't suitable for use with HBase. Most large
> clusters use a separate version of HDFS for HBase.
> -- Owen
