From: Chris Douglas
To: general@hadoop.apache.org
Reply-To: general@hadoop.apache.org
Date: Thu, 30 Aug 2012 18:24:01 -0700
Subject: Re: [DISCUSS] Spin out MR, HDFS and YARN as their own TLPs and disband Hadoop umbrella project
In-Reply-To: <0C112F45-3130-4093-B0F2-2D1F02B8C84C@jpl.nasa.gov>
References: <0C112F45-3130-4093-B0F2-2D1F02B8C84C@jpl.nasa.gov>
Content-Type: text/plain; charset=ISO-8859-1

+1 for splitting the projects
+1 for adding all MR contributors to Yarn

I may have missed its mention in this thread, but maintaining the 1.x branch is probably the most
awkward technical hurdle. I'm not sure how that should be managed if the
projects are split. In one strategy, it can be left with Common+HDFS
until 2.x stabilizes.

The tasks that are simpler in a unified project (releases, cross-project
patches, etc.) are relatively rare, but all dev has paid a tax. That
acknowledged, as Arun points out: the half-measures that have made the
split painful can be fixed, and enthusiasm/resources appear to be
available for that. As long as TLPs are reconciled quickly and
decisively, this can be successful. Without dedicated resources, we can
expect the same result as before.

As for what this accomplishes: each subproject is more approachable on
its own. I don't think it will alleviate political tensions, nor are
such tensions inherently unhealthy. But a split can limit the scope to
the particular subproject and its interests. It's also easier for
collaborators to engage the subset of contributors charged with its
roadmap: Pig/Hive should be able to wrangle MapReduce and Yarn folks on
their dev list, as HBase should engage HDFS without importing extra
context.

As another practical matter: we should change the bylaws so emeritus PMC
members/committers can reinstate themselves without a vote. I expect
many people, including myself, would have no problem signaling periods
of inactivity if project politics were out of the equation. -C

On Tue, Aug 28, 2012 at 7:33 PM, Mattmann, Chris A (388J) wrote:
> [decided to minimize traffic and to simply put this in one thread]
>
> Hi Guys,
>
> See the recent discussion on these threads:
>
> YARN as its own Hadoop "sub project": http://s.apache.org/WW1
> Maintain a single committer list for the Hadoop project: http://s.apache.org/Owx
>
> ...and just pay attention to the Hadoop project over the last 3-4 years. It's operating
> as a single project, that's masking separate communities that themselves are really
> separate ASF projects.
>
> At the ASF, this has been a problem area called "umbrella" projects and over the years,
> all I've seen from them is wasted bandwidth, artificial barriers and the inventions of
> new ways to perform process mongering and to reduce the fun in developing software
> at this fantastic foundation.
>
> I've talked about umbrella projects enough. We've diverted conversation enough.
> Enough people have tried to act like there is some technical mumbo jumbo that is
> preventing the eventual act of higher power that I myself hope comes should these
> discussions prove unfruitful through normal means.
>
> *these. are. separate. projects.*
> *there.are.not.blocker.issues.from.spinning.out.these.projects.as.their.own.communities*
>
> In this email: http://s.apache.org/rSm
>
> And in the 2 subsequent follow ons in that thread, I've outlined a process that I'll copy
> through below for splitting these projects into their own TLPs:
>
> -----snip
> Process:
>
> 0. [DISCUSS] thread for in which you talk about #1 and #2 below, potentially draft resolution too.
>
> 1. Decide on an initial set of *PMC* members. I urge each new TLP to adopt PMC==C. See reasons I've
> already discussed.
>
> 2. Decide on a chair. Try not to VOTE for this explicitly, see if can be discussed and consensus
> can be reached (just a thought experiment). VOTE if necessary.
>
> 3. [VOTE] thread for
>
> 4. Create Project:
>    a. paste resolution from #0 to board@ or;
>    b. go to general@incubator and start new Incubator project.
>
> 5. infrastructure set up.
>    MLs moving; new UNIX groups; website setup;
>    SVN setup like this:
>
>    svn copy -m "MR TLP." https://svn.apache.org/repos/asf/hadoop/ https://svn.apache.org/repos/asf/; or
>    svn copy -m "YARN TLP." https://svn.apache.org/repos/asf/hadoop/ https://svn.apache.org/repos/asf/; or
>    svn copy -m "HDFS TLP." https://svn.apache.org/repos/asf/hadoop/ https://svn.apache.org/repos/asf/
>
>    After all 3 have been created run:
>
>    svn remove -m "Remove Hadoop umbrella TLP. Split into separate projects." https://svn.apache.org/repos/asf/hadoop
>
> 6. (TLPs if 4a; Incubator podling if 4b;) proceed, collaborate, operate as distinct communities, and try to solve the code duplication/dependency
> issues from there.
>
> 7. If 4b; then graduate as TLP from Incubator.
>
> -----snip
>
> So that's my proposal.
>
> Thanks guys.
>
> Cheers,
> Chris
>
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Chris Mattmann, Ph.D.
> Senior Computer Scientist
> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> Office: 171-266B, Mailstop: 171-246
> Email: chris.a.mattmann@nasa.gov
> WWW: http://sunset.usc.edu/~mattmann/
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Adjunct Assistant Professor, Computer Science Department
> University of Southern California, Los Angeles, CA 90089 USA
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>