Date: Wed, 29 Aug 2012 11:41:29 -0700
Subject: Re: [DISCUSS] Spin out MR, HDFS and YARN as their own TLPs and disband Hadoop umbrella project
From: Eli Collins
To: general@hadoop.apache.org
In-Reply-To: <0C112F45-3130-4093-B0F2-2D1F02B8C84C@jpl.nasa.gov>

Thanks for writing up a proposal, Chris.

I think it makes sense to have Common live in HDFS, at least for now, since HDFS is at the bottom of the stack / dependency chain, its code is the most intertwined with Common, and, per Arun, we tend to work on Common stuff more than the MR people do. The HDFS project is really a lot more than HDFS, e.g. it has all the hadoop commands, non-HDFS file system source, etc., but that seems like an OK starting point. We need to figure out the committers and PMC, though, since the goal is to have just the HDFS community (vs. the current Hadoop people), but the project will contain non-HDFS stuff.

I'd like to hear from the current Hadoop committers and PMC members who mostly work on MR and YARN: are you guys OK with losing your current privileges on the HDFS repo? Otherwise we haven't made much progress (i.e. HDFS still has multiple communities).

We also need to address the areas where it's not so cut and dried, e.g. where there is a single Hadoop project:

- The Hadoop trademark: I assume this lives in the HDFS project if Common does?
- The user community, e.g. the user lists that we *just* merged: shall we still keep one list?
- We should move the global stuff like "how to get started" docs to Bigtop, which can point to individual projects' resources
- Hadoop 1.x is in maintenance mode, though it still actively gets patches, so we need to consider it. The surgery necessary to split v1 Hadoop is probably not suitable for a sustaining release and not worth it at this point in the lifetime of this branch. I assume the HDFS project will then host the Hadoop 1.x branches? This implies only members of the HDFS project can commit and release.

Thanks,
Eli

On Tue, Aug 28, 2012 at 7:33 PM, Mattmann, Chris A (388J) wrote:
> [decided to minimize traffic and to simply put this in one thread]
>
> Hi Guys,
>
> See the recent discussion on these threads:
>
> YARN as its own Hadoop "sub project": http://s.apache.org/WW1
> Maintain a single committer list for the Hadoop project: http://s.apache.org/Owx
>
> ...and just pay attention to the Hadoop project over the last 3-4 years. It's operating
> as a single project, that's masking separate communities that themselves are really
> separate ASF projects.
>
> At the ASF, this has been a problem area called "umbrella" projects and over the years,
> all I've seen from them is wasted bandwidth, artificial barriers and the inventions of
> new ways to perform process mongering and to reduce the fun in developing software
> at this fantastic foundation.
>
> I've talked about umbrella projects enough. We've diverted conversation enough.
> Enough people have tried to act like there is some technical mumbo jumbo that is
> preventing the eventual act of higher power that I myself hope comes should these
> discussions prove unfruitful through normal means.
>
> *these. are. separate. projects.*
> *there.are.not.blocker.issues.from.spinning.out.these.projects.as.their.own.communities*
>
> In this email: http://s.apache.org/rSm
>
> And in the 2 subsequent follow ons in that thread, I've outlined a process that I'll copy
> through below for splitting these projects into their own TLPs:
>
> -----snip
> Process:
>
> 0. [DISCUSS] thread for in which you talk about #1 and #2 below, potentially draft resolution too.
>
> 1. Decide on an initial set of *PMC* members. I urge each new TLP to adopt PMC==C. See reasons I've
> already discussed.
>
> 2. Decide on a chair. Try not to VOTE for this explicitly, see if can be discussed and consensus
> can be reached (just a thought experiment). VOTE if necessary.
>
> 3. [VOTE] thread for
>
> 4. Create Project:
>    a. paste resolution from #0 to board@ or;
>    b. go to general@incubator and start new Incubator project.
>
> 5. infrastructure set up.
>    MLs moving; new UNIX groups; website setup;
>    SVN setup like this:
>
>    svn copy -m "MR TLP." https://svn.apache.org/repos/asf/hadoop/ https://svn.apache.org/repos/asf/; or
>    svn copy -m "YARN TLP." https://svn.apache.org/repos/asf/hadoop/ https://svn.apache.org/repos/asf/; or
>    svn copy -m "HDFS TLP." https://svn.apache.org/repos/asf/hadoop/ https://svn.apache.org/repos/asf/
>
>    After all 3 have been created run:
>
>    svn remove -m "Remove Hadoop umbrella TLP. Split into separate projects." https://svn.apache.org/repos/asf/hadoop
>
> 6. (TLPs if 4a; Incubator podling if 4b;) proceed, collaborate, operate as distinct communities, and try to solve the code duplication/dependency
> issues from there.
>
> 7. If 4b; then graduate as TLP from Incubator.
>
> -----snip
>
> So that's my proposal.
>
> Thanks guys.
>
> Cheers,
> Chris
>
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Chris Mattmann, Ph.D.
> Senior Computer Scientist
> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> Office: 171-266B, Mailstop: 171-246
> Email: chris.a.mattmann@nasa.gov
> WWW: http://sunset.usc.edu/~mattmann/
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Adjunct Assistant Professor, Computer Science Department
> University of Southern California, Los Angeles, CA 90089 USA
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>