hadoop-general mailing list archives

From Todd Lipcon <t...@cloudera.com>
Subject Re: Hadoop versions & distributions
Date Mon, 05 Jul 2010 21:20:32 GMT
On Mon, Jul 5, 2010 at 1:12 AM, Evert Lammerts <Evert.Lammerts@sara.nl> wrote:

>  There are a number of different versions and distributions of Hadoop
> which, as far as I understand, all differ from each other. I know that in
> the 0.20-append branch, files in HDFS can be appended, and that the Y!
> distribution (0.20.S) implements security features through Kerberos. And
> then there are the 0.20.3 and 0.22.0 branches. And trunk of course, which I
> guess is 0.20.2 nowadays? In addition to that there are distributions by
> Cloudera (CDH2 / 3beta) and IBM (IDAH).
> From my perspective, setting up a pilot cluster for a small number of users
> from different institutes, security (0.20.S) is very attractive – scientists
> like the idea of shielding their data and logic from other users. But what
> will I miss if I choose Y!’s distribution over all of these other options?
Hi Evert,

Y!'s distribution does contain a good set of patches, and we at Cloudera are
always keeping track of the ydist git repository to incorporate those
changes into CDH. Currently, ydist contains the security patch series, but
doesn't include the recent append work. CDH3b2 includes the append work, but
not security as of yet -- we are currently integrating security and it
should be available in the next beta.

Aside from the specific patches included, it's worth noting that the Y! dist
is a git repository, rather than a full binary-and-source distribution of
Hadoop and related tools. CDH not only includes the core Hadoop components
but also integrates many other important ecosystem components, including Pig,
Hive, Oozie, HBase, ZooKeeper, and Flume.


Todd Lipcon
Software Engineer, Cloudera
