hadoop-general mailing list archives

From Jeff Hammerbacher <ham...@cloudera.com>
Subject Re: [ANNOUNCEMENT] Yahoo focusing on Apache Hadoop, discontinuing "The Yahoo Distribution of Hadoop"
Date Tue, 01 Feb 2011 03:44:02 GMT
Excellent news! Will you also make Howl, Oozie, and Yarn Apache projects as
well?

On Mon, Jan 31, 2011 at 7:27 PM, Eric Baldeschwieler
<eric14@yahoo-inc.com> wrote:

> Hi Folks,
>
> I'm pleased to announce that after some reflection, Yahoo! has decided to
> discontinue "The Yahoo Distribution of Hadoop" and focus on Apache
> Hadoop.  We plan to remove all references to a Yahoo distribution from our
> website (developer.yahoo.com/hadoop), close our github repo (
> yahoo.github.com/hadoop-common) and focus on working more closely with the
> Apache community.  Our intent is to return to helping Apache produce binary
> releases of Apache Hadoop that are so bulletproof that Yahoo and other
> production Hadoop users can run them unpatched on their clusters.
>
> Until Hadoop 0.20, Yahoo committers worked as release masters to produce
> binary Apache Hadoop releases that the entire community used on their
> clusters.  As the community grew, we experimented with using the
> "Yahoo! Distribution of Hadoop" as the vehicle to share our work.
>  Unfortunately, Apache is no longer the obvious place to go for Hadoop
> releases.  The Yahoo! team wants to return to a world where anyone can
> download and directly use releases of Hadoop from Apache.  We want to
> contribute to the stabilization and testing of those releases.  We also want
> to share our regular program of sustaining engineering that backports minor
> feature enhancements into new dot releases on a regular basis, so that the
> world sees regular improvements coming from Apache every few months, not
> years.
>
> Recently the Apache Hadoop community has been very turbulent.  Over the
> last few months we have been developing Hadoop enhancements in our internal
> git repository while doing a complete review of our options. Our commitment
> to open sourcing our work was never in doubt (see http://yhoo.it/e8p3Dd),
> but the future of the "Yahoo distribution of Hadoop" was far from clear.
>  We've concluded that focusing on Apache Hadoop is the way forward.  We
> believe that more focus on communicating our goals to the Apache Hadoop
> community, and more willingness to compromise on how we get to those goals,
> will help us get back to making Hadoop even better.
>
> Unfortunately, we now have to sort out how to contribute several
> person-years' worth of work to Apache to let us unwind the Yahoo! git
> repositories.  We currently run two lines of Hadoop development, our
> sustaining program (hadoop-0.20-sustaining) and hadoop-future.
>  Hadoop-0.20-sustaining is the stable version of Hadoop we currently run on
> Yahoo's 40,000 nodes.  It contains a series of fixes and enhancements that
> are all backwards compatible with our "Hadoop 0.20 with security".  It is
> our most stable and high performance release of Hadoop ever.  We've expended
> a lot of energy finding and fixing bugs in it this year. We have initiated
> the process of contributing this work to Apache in the branch:
> hadoop/common/branches/branch-0.20-security.  We've proposed calling this
> the 20.100 release.  Once folks have had a chance to try this out and we've
> had a chance to respond to their feedback, we plan to create 20.100 release
> candidates and ask the community to vote on making them Apache releases.
>
> Hadoop-future is our new feature branch.  We are working on a set of new
> features for Hadoop to improve its availability, scalability and
> interoperability to make Hadoop more usable in mission critical deployments.
> You're going to see another burst of email activity from us as we work to
> get hadoop-future patches socialized, reviewed and checked in.  These bulk
> checkins are exceptional.  They are the result of us striving to be more
> transparent.  Once we've merged our hadoop-future and hadoop-0.20-sustaining
> work back into Apache, folks can expect us to return to our regular
> development cadence.  Looking forward, we plan to socialize our roadmaps
> regularly, actively synchronize our work with other active Hadoop
> contributors and develop our code collaboratively, directly in Apache.
>
> In summary, our decision to discontinue the "Yahoo! Distribution of Hadoop"
> is a commitment to working more effectively with the Apache Hadoop
> community.  Our goal is to make Apache Hadoop THE open source platform for
> big data.
>
> Thanks,
>
> E14
>
> --
>
> PS Here is a draft list of key features in hadoop-future:
>
> * HDFS-1052 - Federation, the ability to support much more storage per
> Hadoop cluster.
>
> * HADOOP-6728 - A new metrics framework
>
> * MAPREDUCE-1220 - Optimizations for small jobs
>
> ---
> PPS This is cross-posted on our blog: http://yhoo.it/i9Ww8W
