hadoop-general mailing list archives

From Doug Cutting <cutt...@apache.org>
Subject Re: [DISCUSS] Spin out MR, HDFS and YARN as their own TLPs and disband Hadoop umbrella project
Date Thu, 30 Aug 2012 16:17:46 GMT
On Wed, Aug 29, 2012 at 7:29 PM, Mattmann, Chris A (388J)
<chris.a.mattmann@jpl.nasa.gov> wrote:
> You're right, it's not project boundaries, it's poor community behavior,
> and general umbrella-project-ness.

The primary problem I see with umbrellas is that the PMC isn't able to
accurately represent the developer community.  Hadoop used to have
that problem, when HBase, etc. were subprojects and most PMC members
were not involved in those subprojects.  Currently this is less of a
problem.  Many PMC members are involved in several different parts of
the project and most PMC members follow all the developer mailing
lists.  Hadoop at present thus bears some resemblance to an umbrella but
is by no means a classic umbrella.

> One aspect I've seen is that exclusivity of allowing people to become
> PMC members on the project, and the separation of PMC from C.
> Other things I've seen are the use of technical justifications or complexity
> issues as an excuse for the exclusivity, as an excuse for drawing boundaries
> between project committers and PMC members, and then between specific
> products that the project and community as a whole releases, and finally
> other things I've seen include external interests influencing the way that
> business is done around here (need for releases in downstream companies,
> or projects driving upstream, Apache decisions, which are supposed to be
> independent of any lone company, or set of companies -- it's individuals here
> that do the work).

I am unconvinced that splitting Hadoop into three projects is a
panacea for these issues.  For example, adding committers to the
sub-lists has been contentious even among the members of those

Splitting is perhaps a better long-term structure for the project.
But it should be done slowly and carefully.  Moving too quickly could
cause a lot of extra work for a lot of people, both in the project and
downstream.  A series of incremental steps should prove less painful.
For example, the YARN developers might propose that they fork to a new
TLP.  The YARN code could then be removed from the mother
project's trunk but remain in branches for compatible bugfix releases.
Downstream projects could start adding a dependency on the YARN
project once it makes releases.

