hadoop-general mailing list archives

From Suresh Srinivas <sur...@hortonworks.com>
Subject Re: [DISCUSS] Spin out MR, HDFS and YARN as their own TLPs and disband Hadoop umbrella project
Date Wed, 29 Aug 2012 17:02:56 GMT
I am +1 for splitting up the projects. This is a step in the right
direction. There will be challenges along the way, but I am confident we can
solve them.

Robert and Alejandro have brought up good questions. Here are my thoughts:
- For the first one or two releases, all the projects can coordinate and do
the releases together. This should simplify the immediate work needed and
help us meet the release timelines that we are working towards. As the split
makes progress, this cross-project coordination will no longer be necessary.
I volunteer to RM these releases and do the needed coordination from HDFS.
- As regards APIs, we currently have LimitedPrivate APIs for related
projects. These have been used by HBase as well. We need to think about a
timeline by which we can mark these APIs stable. They should remain
LimitedPrivate. Any rare changes to these APIs then require only coordination
among the projects, and no user applications (which we have no control over)
are affected.
- I agree with Arun that Common can move with HDFS.

Regards,
Suresh

On Wed, Aug 29, 2012 at 9:31 AM, Arun C Murthy <acm@hortonworks.com> wrote:

>
> On Aug 28, 2012, at 8:50 PM, Alejandro Abdelnur wrote:
>
> > Chris, thanks for initiating the discussion.
>
> Likewise, thanks Chris!
>
> >
> > IMO a pre-requisite to this is to figure out how we'll handle the
> following:
> >
>
>
> Good points - I'd recommend we keep Common and HDFS in the same project.
> Yes, MR/YARN will need some changes in Common occasionally, but core pieces
> like RPC have been maintained by HDFS folks over time anyway, e.g. the move
> to ProtoBufs was led by Sanjay, Suresh, Todd, Jitendra et al.
>
> We can move SequenceFile into MR if necessary and keep same package names
> for compatibility.
>
> We should, of course, stop tweaking things in different projects in the
> same jira - we've been reasonably good at not doing that.
>
> Thoughts?
>
> Arun
>
> > * Where does the common stuff live?
> > * What are the public interfaces of each project (towards the other
> projects)?
> > * How do we do development/releases? In tandem? Separately? How will
> > this work in practice? Currently we are constantly tweaking things
> > across projects, sometimes in the same JIRAs, sometimes in follow-up
> > JIRAs.
> >
> > Thoughts?
> >
> > Thxs.
> >
> > On Tue, Aug 28, 2012 at 7:33 PM, Mattmann, Chris A (388J)
> > <chris.a.mattmann@jpl.nasa.gov> wrote:
> >> [decided to minimize traffic and to simply put this in one thread]
> >>
> >> Hi Guys,
> >>
> >> See the recent discussion on these threads:
> >>
> >> YARN as its own Hadoop "sub project": http://s.apache.org/WW1
> >> Maintain a single committer list for the Hadoop project:
> http://s.apache.org/Owx
> >>
> >> ...and just pay attention to the Hadoop project over the last 3-4
> years. It's operating
> >> as a single project, that's masking separate communities that
> themselves are really
> >> separate ASF projects.
> >>
> >> At the ASF, this has been a problem area called "umbrella" projects and
> over the years,
> >> all I've seen from them is wasted bandwidth, artificial barriers and
> the inventions of
> >> new ways to perform process mongering and to reduce the fun in
> developing software
> >> at this fantastic foundation.
> >>
> >> I've talked about umbrella projects enough. We've diverted conversation
> enough.
> >> Enough people have tried to act like there is some technical mumbo
> jumbo that is
> >> preventing the eventual act of higher power that I myself hope comes
> should these
> >> discussions prove unfruitful through normal means.
> >>
> >> *these. are. separate. projects.*
> >>
> *there.are.not.blocker.issues.from.spinning.out.these.projects.as.their.own.communities*
> >>
> >> In this email: http://s.apache.org/rSm
> >>
> >> And in the 2 subsequent follow ons in that thread, I've outlined a
> process that I'll copy
> >> through below for splitting these projects into their own TLPs:
> >>
> >> -----snip
> >> Process:
> >>
> >> 0. [DISCUSS] thread for <TLP name> in which you talk about #1 and #2
> below, potentially draft resolution too.
> >>
> >> 1. Decide on an initial set of *PMC* members. I urge each new TLP to
> adopt PMC==C. See reasons I've
> >> already discussed.
> >>
> >> 2. Decide on a chair. Try not to VOTE for this explicitly, see if can
> be discussed and consensus
> >> can be reached (just a thought experiment). VOTE if necessary.
> >>
> >> 3. [VOTE] thread for <TLP name>
> >>
> >> 4. Create Project:
> >>  a. paste resolution from #0 to board@ or;
> >>  b. go to general@incubator and start new Incubator project.
> >>
> >> 5. infrastructure set up.
> >>   MLs moving; new UNIX groups; website setup;
> >>   SVN setup like this:
> >>
> >> svn copy -m "MR TLP." https://svn.apache.org/repos/asf/hadoop/
> https://svn.apache.org/repos/asf/<insert cool MR name>; or
> >> svn copy -m "YARN TLP." https://svn.apache.org/repos/asf/hadoop/
> https://svn.apache.org/repos/asf/<insert cool YARN name>; or
> >> svn copy -m "HDFS TLP." https://svn.apache.org/repos/asf/hadoop/
> https://svn.apache.org/repos/asf/<insert cool HDFS name>
> >>
> >> After all 3 have been created run:
> >>
> >> svn remove -m "Remove Hadoop umbrella TLP. Split into separate
> projects." https://svn.apache.org/repos/asf/hadoop
> >>
> >> 6. (TLPs if 4a; Incubator podling if 4b;) proceed, collaborate, operate
> as distinct communities, and try to solve the code duplication/dependency
> >> issues from there.
> >>
> >> 7. If 4b; then graduate as TLP from Incubator.
> >>
> >> -----snip
> >>
> >> So that's my proposal.
> >>
> >> Thanks guys.
> >>
> >> Cheers,
> >> Chris
> >>
> >> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >> Chris Mattmann, Ph.D.
> >> Senior Computer Scientist
> >> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> >> Office: 171-266B, Mailstop: 171-246
> >> Email: chris.a.mattmann@nasa.gov
> >> WWW:   http://sunset.usc.edu/~mattmann/
> >> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >> Adjunct Assistant Professor, Computer Science Department
> >> University of Southern California, Los Angeles, CA 90089 USA
> >> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >>
> >
> >
> >
> > --
> > Alejandro
>
> --
> Arun C. Murthy
> Hortonworks Inc.
> http://hortonworks.com/
>
>
>


-- 
http://hortonworks.com/download/
