incubator-general mailing list archives

From "Owen O'Malley" <omal...@apache.org>
Subject Re: [VOTE] Accept Tez into Incubator
Date Thu, 21 Feb 2013 15:52:37 GMT
+1 (binding)


On Wed, Feb 20, 2013 at 10:07 PM, Mattmann, Chris A (388J) <
chris.a.mattmann@jpl.nasa.gov> wrote:

> +1 (binding)
>
> Thanks!
>
> Cheers,
> Chris
>
> On 2/19/13 8:26 PM, "Arun C Murthy" <acm@hortonworks.com> wrote:
>
> >Hi Folks,
> >
> >Thanks for participating in the discussion. I'd like to call a VOTE for
> >acceptance of Apache Tez into the Incubator. I'll let the vote run until
> >this weekend (Sun 2/24 6pm PST).
> >
> >[ ]  +1 Accept Apache Tez into the Incubator
> >[ ]  +0 Don't care.
> >[ ]  -1 Don't accept Apache Tez into the Incubator because...
> >
> >Full proposal is pasted at the bottom of this email, and the
> >corresponding wiki is http://wiki.apache.org/incubator/TezProposal.
> >
> >Only VOTEs from Incubator PMC members are binding, but all are welcome to
> >express their thoughts.
> >
> >Here's my +1 (binding).
> >
> >thanks,
> >Arun
> >
> >PS: From the initial discussion, the only changes are that I've added one
> >new mentor and 2 new committers. All the new additions come from the
> >non-major employer while we continue to strive to further diversify
> >during the incubation. Thanks.
> >
> >----
> >
> >= Tez =
> >
> >== Abstract ==
> >Tez is an effort to develop a generic application framework which can be
> >used to process arbitrarily complex data-processing tasks and also a
> >re-usable set of data-processing primitives which can be used by other
> >projects.
> >
> >== Proposal ==
> >Tez is a proposal to develop a generic application which can be used to
> >process complex data-processing task DAGs and runs natively on Apache
> >Hadoop YARN. YARN is a generic resource-management system on which
> >applications like MapReduce already run. MapReduce is a specific, and
> >constrained, DAG, which is not optimal for several frameworks like Apache
> >Hive and Apache Pig. Furthermore, we propose to develop a re-usable set of
> >libraries of data-processing primitives such as sorting, merging,
> >data-shuffling, intermediate data management etc. which are necessary for
> >Tez and which we envision can be used directly by other projects.
> >
> >== Background ==
> >Apache Hadoop MapReduce has emerged as the assembly language on which
> >other frameworks like Apache Pig and Apache Hive have been built. However,
> >it is well accepted that MapReduce produces very constrained task DAGs for
> >each job, which results in Apache Pig and Apache Hive requiring multiple
> >MapReduce jobs for several queries. By providing a more expressive DAG of
> >tasks for a job, Tez attempts to provide significantly enhanced
> >data-processing capabilities for projects like Apache Pig, Apache Hive,
> >Cascading etc.
> >
> >== Rationale ==
> >There is an important gap that Tez fills in the Apache Hadoop ecosystem:
> >allowing for more expressive task DAGs for data-processing applications
> >such as Apache Pig, Apache Hive, Cascading etc.
> >
> >With the emergence of Apache Hadoop YARN, there is a strong need for a
> >common DAG application which can then be shared by Apache Pig, Apache
> >Hive, Cascading etc.
> >
> >== Initial Goals ==
> >The initial goals for this project are to specify the detailed
> >requirements and architecture, and then develop the initial implementation
> >including the DAG ApplicationMaster to run natively inside Apache Hadoop
> >YARN.
> >
> >== Current Status ==
> >Significant work has been completed to identify the initial requirements
> >and define the overall system architecture. There is a patch available in
> >the internal Hortonworks git repository which can act as the initial seed.
> >
> >=== Meritocracy ===
> >We plan to invest in supporting a meritocracy. We will discuss the
> >requirements in an open forum. Several companies have already expressed
> >interest in this project, and we intend to invite additional developers to
> >participate. We will encourage and monitor community participation so that
> >privileges can be extended to those that contribute.
> >
> >=== Community ===
> >The need for a generic DAG application for data processing in open source
> >is tremendous, so there is potential for a very large community. We
> >believe that Tez's extensible architecture will further encourage
> >community participation. Also, related Apache projects (e.g., Pig, Hive)
> >have very large and active communities, and we expect that over time Tez
> >will also attract a large community.
> >
> >=== Core Developers ===
> >The developers on the initial committers list include people very
> >experienced in the Apache Hadoop ecosystem:
> >
> > * Alan Gates <gates at apache dot org>
> > * Arun C Murthy <acmurthy at apache dot org>
> > * Ashutosh Chauhan <hashutosh at apache dot org>
> > * Bikas Saha <bikas at apache dot org>
> > * Chris Douglas <cdouglas at apache dot org>
> > * Daryn Sharp <daryn at apache dot org>
> > * Devaraj Das <ddas at apache dot org>
> > * Gopal Vijayaraghavan <gopal at hortonworks dot com>
> > * Gunther Hagleitner <ghagleitner at hortonworks dot com>
> > * Hitesh Shah <hitesh at apache dot org>
> > * Jason Lowe <jlowe at apache dot org>
> > * Jean Xu <jeanxu at facebook dot com>
> > * Jitendra Pandey <jitendra at apache dot org>
> > * Julien Le Dem <julien at apache dot org>
> > * Kevin Wilfong <kevinwilfong at apache dot org>
> > * Mike Liddell <mike dot lidell at microsoft dot com>
> > * Namit Jain <namit at apache dot org>
> > * Nathan Roberts <nroberts at yahoo dash inc dot com>
> > * Owen O'Malley <omalley at apache dot org>
> > * Robert Evans <bobby at apache dot org>
> > * Siddharth Seth <sseth at apache dot org>
> > * Tom White <tomwhite at apache dot org>
> > * Thomas Graves <tgraves at apache dot org>
> > * Vikram Dixit <vikram at apache dot org>
> > * Vinod Kumar Vavilapalli <vinodkv at apache dot org>
> > * William Graham <billgraham at apache dot org>
> >
> >We realize that though we have significant employer diversity already,
> >additional diversity is always better, and we will work
> >aggressively to recruit developers from additional companies.
> >
> >=== Alignment ===
> >The initial committers strongly believe that a standard task DAG
> >application on Apache Hadoop YARN will gain broader adoption as an open
> >source, community-driven project, where the community can contribute not
> >only to the core components, but also to a growing collection of
> >applications which will be built on top of Tez. Our hope is that the
> >Apache Hive, Apache Pig, Cascading and other communities will find
> >tremendous value in Tez and will adopt it en masse.
> >
> >== Known Risks ==
> >
> >=== Orphaned Products ===
> >The contributors are leading users and vendors in the Apache Hadoop
> >ecosystem, with significant open source experience, so the risk of being
> >orphaned is relatively low. The project could be at risk if vendors
> >decided to change their strategies in the market. In such an event, the
> >current committers plan to continue working on the project on their own
> >time, though the progress will likely be slower. We plan to mitigate this
> >risk by recruiting additional committers.
> >
> >=== Inexperience with Open Source ===
> >The initial committers include veteran Apache members (Committers, PMC
> >members and Apache Members) and other developers who have varying degrees
> >of experience with open source projects. All have been involved with
> >source code that has been released under an open source license, and
> >several also have experience developing code with an open source
> >development process.
> >
> >=== Homogenous Developers ===
> >The initial committers are employed by a number of companies, including
> >Cloudera, Facebook, Hortonworks, Microsoft, Twitter and Yahoo. We are
> >committed to recruiting additional committers from other companies based
> >on their contributions to the project even though we do have significant
> >diversity already.
> >
> >=== Reliance on Salaried Developers ===
> >It is expected that Tez development will occur on both salaried time and
> >on volunteer time, after hours. The majority of initial committers are
> >paid by their employer to contribute to this project. However, they are
> >all passionate about the project, and we are confident that the project
> >will continue even if no salaried developers contribute to the project. We
> >are committed to recruiting additional committers including non-salaried
> >developers.
> >
> >=== Relationships with Other Apache Products ===
> >As mentioned in the Alignment section, Tez is closely integrated with
> >Hadoop, Hive and Pig in numerous ways. We look forward to collaborating
> >with those communities, as well as other Apache communities.
> >
> >=== An Excessive Fascination with the Apache Brand ===
> >Tez solves a real need for generic task DAG management in the Apache
> >Hadoop ecosystem, something which has been addressed in a very ad hoc
> >manner so far by multiple Apache projects. Our rationale for developing
> >Tez as an Apache project is detailed in the Rationale section. We believe
> >that the Apache brand and community process will help us attract more
> >contributors to this project, and help establish ubiquitous APIs.
> >
> >== Documentation ==
> >http://wiki.apache.org/incubator/TezProposal
> >
> >== Initial Source ==
> >Available as a patch.
> >
> >== Cryptography ==
> >Tez will eventually support encryption on the wire. This is not one of
> >the initial goals, and we do not expect Tez to be a controlled export item
> >due to the use of encryption.
> >
> >== Required Resources ==
> >
> >=== Mailing List ===
> > * tez-private
> > * tez-dev
> > * tez-user
> >
> >=== Subversion Directory ===
> >Git is the preferred source control system: git://git.apache.org/tez
> >
> >=== Issue Tracking ===
> >
> >JIRA Tez (TEZ)
> >
> >== Initial Committers ==
> > * Alan Gates <gates at apache dot org>
> > * Arun C Murthy <acmurthy at apache dot org>
> > * Ashutosh Chauhan <hashutosh at apache dot org>
> > * Bikas Saha <bikas at apache dot org>
> > * Chris Douglas <cdouglas at apache dot org>
> > * Daryn Sharp <daryn at apache dot org>
> > * Devaraj Das <ddas at apache dot org>
> > * Gopal Vijayaraghavan <gopal at hortonworks dot com>
> > * Gunther Hagleitner <ghagleitner at hortonworks dot com>
> > * Hitesh Shah <hitesh at apache dot org>
> > * Jason Lowe <jlowe at apache dot org>
> > * Jean Xu <jeanxu at facebook dot com>
> > * Jitendra Pandey <jitendra at apache dot org>
> > * Julien Le Dem <julien at apache dot org>
> > * Kevin Wilfong <kevinwilfong at apache dot org>
> > * Mike Liddell <mike dot lidell at microsoft dot com>
> > * Namit Jain <namit at apache dot org>
> > * Nathan Roberts <nroberts at yahoo dash inc dot com>
> > * Owen O'Malley <omalley at apache dot org>
> > * Robert Evans <bobby at apache dot org>
> > * Siddharth Seth <sseth at apache dot org>
> > * Tom White <tomwhite at apache dot org>
> > * Thomas Graves <tgraves at apache dot org>
> > * Vikram Dixit <vikram at apache dot org>
> > * Vinod Kumar Vavilapalli <vinodkv at apache dot org>
> > * William Graham <billgraham at apache dot org>
> >
> >== Affiliations ==
> >The initial committers are employees of Cloudera, Facebook, Hortonworks,
> >Microsoft, Twitter and Yahoo Inc.
> >
> > * Alan Gates - Hortonworks
> > * Arun C Murthy - Hortonworks
> > * Ashutosh Chauhan - Hortonworks
> > * Bikas Saha - Hortonworks
> > * Chris Douglas - Microsoft
> > * Daryn Sharp - Yahoo
> > * Devaraj Das - Hortonworks
> > * Gopal Vijayaraghavan - Hortonworks
> > * Gunther Hagleitner - Hortonworks
> > * Hitesh Shah - Hortonworks
> > * Jason Lowe - Yahoo
> > * Jean Xu - Facebook
> > * Jitendra Pandey - Hortonworks
> > * Julien Le Dem - Twitter
> > * Kevin Wilfong - Facebook
> > * Mike Liddell - Microsoft
> > * Namit Jain - Facebook
> > * Nathan Roberts - Yahoo
> > * Owen O'Malley - Hortonworks
> > * Robert Evans - Yahoo
> > * Siddharth Seth - Hortonworks
> > * Tom White - Cloudera
> > * Thomas Graves - Yahoo
> > * Vikram Dixit - Hortonworks
> > * Vinod Kumar Vavilapalli - Hortonworks
> > * William Graham - Twitter
> >
> >The nominated mentors are employees of Hortonworks, LinkedIn,
> >NASA JPL and Microsoft.
> >
> > * Alan Gates - Hortonworks
> > * Arun C Murthy - Hortonworks
> > * Chris Douglas - Microsoft
> > * Chris Mattmann - NASA JPL
> > * Jakob Homan - LinkedIn
> > * Owen O'Malley - Hortonworks
> >
> >== Sponsors ==
> >
> >=== Champion ===
> >Arun C Murthy <acmurthy at apache dot org>
> >
> >=== Nominated Mentors ===
> > * Alan Gates <gates at apache dot org> - Architect at Hortonworks.
> >Committer for Pig.
> > * Arun C Murthy <acmurthy at apache dot org> - Architect at
> >Hortonworks. Committer for Hadoop.
> > * Chris Douglas <cdouglas at apache dot org> - Sr. Research Engineer at
> >Microsoft. Committer for Hadoop.
> > * Chris Mattmann <mattmann at apache dot org> - Sr. Computer Scientist,
> >NASA JPL. Committer for Nutch, OODT and Tika.
> > * Jakob Homan <jghoman at apache dot org> - Sr. Software Engineer,
> >LinkedIn. Committer for Hadoop, Kafka, Giraph.
> > * Owen O'Malley <omalley at apache dot org> - Architect at
> >Hortonworks. Committer for Hadoop, Ambari.
> >
> >=== Sponsoring Entity ===
> >Incubator
> >
>
>
