incubator-general mailing list archives

From "Mattmann, Chris A (388J)" <chris.a.mattm...@jpl.nasa.gov>
Subject Re: [VOTE] Accept Tez into Incubator
Date Thu, 21 Feb 2013 06:07:14 GMT
+1 (binding)

Thanks!

Cheers,
Chris

On 2/19/13 8:26 PM, "Arun C Murthy" <acm@hortonworks.com> wrote:

>Hi Folks,
>
>Thanks for participating in the discussion. I'd like to call a VOTE for
>acceptance of Apache Tez into the Incubator. I'll let the vote run until
>this weekend (Sun 2/24 6pm PST).
>
>[ ]  +1 Accept Apache Tez into the Incubator
>[ ]  +0 Don't care.
>[ ]  -1 Don't accept Apache Tez into the Incubator because...
>
>Full proposal is pasted at the bottom of this email, and the
>corresponding wiki is http://wiki.apache.org/incubator/TezProposal.
>
>Only VOTEs from Incubator PMC members are binding, but all are welcome to
>express their thoughts.
>
>Here's my +1 (binding).
>
>thanks,
>Arun
>
>PS: Since the initial discussion, the only changes are that I've added one
>new mentor and 2 new committers. All the new additions come from employers
>other than the major one, and we continue to strive to diversify further
>during incubation. Thanks.
>
>----
>
>= Tez =
>
>== Abstract ==
>Tez is an effort to develop a generic application framework which can be
>used to process arbitrarily complex data-processing tasks, and also a
>re-usable set of data-processing primitives which can be used by other
>projects.
>
>== Proposal ==
>Tez is a proposal to develop a generic application which can be used to
>process complex data-processing task DAGs and which runs natively on
>Apache Hadoop YARN. YARN is a generic resource-management system on which
>applications like MapReduce already exist. MapReduce is a specific, and
>constrained, DAG, which is not optimal for several frameworks like Apache
>Hive and Apache Pig. Furthermore, we propose to develop a re-usable set of
>libraries of data-processing primitives such as sorting, merging,
>data-shuffling, intermediate data management etc., which are necessary
>for Tez and which we envision can be used directly by other projects.
>
>== Background ==
>Apache Hadoop MapReduce has emerged as the assembly language on which
>other frameworks like Apache Pig and Apache Hive have been built.
>However, it is well accepted that MapReduce produces very constrained
>task DAGs for each job, which results in Apache Pig and Apache Hive
>requiring multiple MapReduce jobs for several queries. By providing a
>more expressive DAG of tasks for a job, Tez attempts to provide
>significantly enhanced data-processing capabilities for projects like
>Apache Pig, Apache Hive, Cascading etc.
>
>== Rationale ==
>There is an important gap that Tez fills in the Apache Hadoop ecosystem:
>allowing for more expressive task DAGs for data-processing applications
>such as Apache Pig, Apache Hive, Cascading etc.
>
>With the emergence of Apache Hadoop YARN, there is a strong need for a
>common DAG application which can then be shared by Apache Pig, Apache
>Hive, Cascading etc.
>
>== Initial Goals ==
>The initial goals for this project are to specify the detailed
>requirements and architecture, and then develop the initial
>implementation including the DAG ApplicationMaster to run natively inside
>Apache Hadoop YARN.
>
>== Current Status ==
>Significant work has been completed to identify the initial requirements
>and define the overall system architecture. There is a patch available in
>the internal Hortonworks git repository which can act as the initial seed.
>
>=== Meritocracy ===
>We plan to invest in supporting a meritocracy. We will discuss the
>requirements in an open forum. Several companies have already expressed
>interest in this project, and we intend to invite additional developers
>to participate. We will encourage and monitor community participation so
>that privileges can be extended to those that contribute.
>
>=== Community ===
>The need for a generic DAG application for data processing in open source
>is tremendous, so there is potential for a very large community. We
>believe that Tez's extensible architecture will further encourage
>community participation. Also, related Apache projects (e.g., Pig, Hive)
>have very large and active communities, and we expect that over time Tez
>will also attract a large community.
>
>=== Core Developers ===
>The developers on the initial committers list include people very
>experienced in the Apache Hadoop ecosystem:
>
> * Alan Gates <gates at apache dot org>
> * Arun C Murthy <acmurthy at apache dot org>
> * Ashutosh Chauhan <hashutosh at apache dot org>
> * Bikas Saha <bikas at apache dot org>
> * Chris Douglas <cdouglas at apache dot org>
> * Daryn Sharp <daryn at apache dot org>
> * Devaraj Das <ddas at apache dot org>
> * Gopal Vijayaraghavan <gopal at hortonworks dot com>
> * Gunther Hagleitner <ghagleitner at hortonworks dot com>
> * Hitesh Shah <hitesh at apache dot org>
> * Jason Lowe <jlowe at apache dot org>
> * Jean Xu <jeanxu at facebook dot com>
> * Jitendra Pandey <jitendra at apache dot org>
> * Julien Le Dem <julien at apache dot org>
> * Kevin Wilfong <kevinwilfong at apache dot org>
> * Mike Liddell <mike dot lidell at microsoft dot com>
> * Namit Jain <namit at apache dot org>
> * Nathan Roberts <nroberts at yahoo dash inc dot com>
> * Owen O'Malley <omalley at apache dot org>
> * Robert Evans <bobby at apache dot org>
> * Siddharth Seth <sseth at apache dot org>
> * Tom White <tomwhite at apache dot org>
> * Thomas Graves <tgraves at apache dot org>
> * Vikram Dixit <vikram at apache dot org>
> * Vinod Kumar Vavilapalli <vinodkv at apache dot org>
> * William Graham <billgraham at apache dot org>
>
>We realize that though we have significant employer diversity already,
>additional diversity is always better, and we will work
>aggressively to recruit developers from additional companies.
>
>=== Alignment ===
>The initial committers strongly believe that a standard task DAG
>application on Apache Hadoop YARN will gain broader adoption as an open
>source, community-driven project, where the community can contribute not
>only to the core components, but also to a growing collection of
>applications which will be based on top of Tez. Our hope is that the
>Apache Hive, Apache Pig, Cascading and other communities will find
>tremendous value in Tez and will adopt it en masse.
>
>== Known Risks ==
>
>=== Orphaned Products ===
>The contributors are leading users and vendors in the Apache Hadoop
>ecosystem, with significant open source experience, so the risk of being
>orphaned is relatively low. The project could be at risk if vendors
>decided to change their strategies in the market. In such an event, the
>current committers plan to continue working on the project on their own
>time, though the progress will likely be slower. We plan to mitigate this
>risk by recruiting additional committers.
>
>=== Inexperience with Open Source ===
>The initial committers include veteran Apache members (Committers, PMC
>members and Apache Members) and other developers who have varying degrees
>of experience with open source projects. All have been involved with
>source code that has been released under an open source license, and
>several also have experience developing code with an open source
>development process.
>
>=== Homogenous Developers ===
>The initial committers are employed by a number of companies, including
>Cloudera, Facebook, Hortonworks, Microsoft, Twitter and Yahoo. We are
>committed to recruiting additional committers from other companies based
>on their contributions to the project even though we do have significant
>diversity already.
>
>=== Reliance on Salaried Developers ===
>It is expected that Tez development will occur on both salaried time and
>on volunteer time, after hours. The majority of initial committers are
>paid by their employer to contribute to this project. However, they are
>all passionate about the project, and we are confident that the project
>will continue even if no salaried developers contribute to the project.
>We are committed to recruiting additional committers including
>non-salaried developers.
>
>=== Relationships with Other Apache Products ===
>As mentioned in the Alignment section, Tez is closely integrated with
>Hadoop, Hive and Pig in numerous ways. We look forward to collaborating
>with those communities, as well as other Apache communities.
>
>=== An Excessive Fascination with the Apache Brand ===
>Tez solves a real need for generic task DAG management in the Apache
>Hadoop ecosystem, something which has been addressed in a very ad hoc
>manner so far by multiple Apache projects. Our rationale for developing
>Tez as an Apache project is detailed in the Rationale section. We believe
>that the Apache brand and community process will help us attract more
>contributors to this project, and help establish ubiquitous APIs.
>
>== Documentation ==
>http://wiki.apache.org/incubator/TezProposal
>
>== Initial Source ==
>Available as a patch.
>
>== Cryptography ==
>Tez will eventually support encryption on the wire. This is not one of
>the initial goals, and we do not expect Tez to be a controlled export
>item due to the use of encryption.
>
>== Required Resources ==
>
>=== Mailing List ===
> * tez-private
> * tez-dev
> * tez-user
>
>=== Subversion Directory ===
>Git is the preferred source control system: git://git.apache.org/tez
>
>=== Issue Tracking ===
>
>JIRA Tez (TEZ) 
>
>== Initial Committers ==
> * Alan Gates <gates at apache dot org>
> * Arun C Murthy <acmurthy at apache dot org>
> * Ashutosh Chauhan <hashutosh at apache dot org>
> * Bikas Saha <bikas at apache dot org>
> * Chris Douglas <cdouglas at apache dot org>
> * Daryn Sharp <daryn at apache dot org>
> * Devaraj Das <ddas at apache dot org>
> * Gopal Vijayaraghavan <gopal at hortonworks dot com>
> * Gunther Hagleitner <ghagleitner at hortonworks dot com>
> * Hitesh Shah <hitesh at apache dot org>
> * Jason Lowe <jlowe at apache dot org>
> * Jean Xu <jeanxu at facebook dot com>
> * Jitendra Pandey <jitendra at apache dot org>
> * Julien Le Dem <julien at apache dot org>
> * Kevin Wilfong <kevinwilfong at apache dot org>
> * Mike Liddell <mike dot lidell at microsoft dot com>
> * Namit Jain <namit at apache dot org>
> * Nathan Roberts <nroberts at yahoo dash inc dot com>
> * Owen O'Malley <omalley at apache dot org>
> * Robert Evans <bobby at apache dot org>
> * Siddharth Seth <sseth at apache dot org>
> * Tom White <tomwhite at apache dot org>
> * Thomas Graves <tgraves at apache dot org>
> * Vikram Dixit <vikram at apache dot org>
> * Vinod Kumar Vavilapalli <vinodkv at apache dot org>
> * William Graham <billgraham at apache dot org>
>
>== Affiliations ==
>The initial committers are employees of Cloudera, Facebook, Hortonworks,
>Microsoft, Twitter and Yahoo Inc.
>
> * Alan Gates - Hortonworks
> * Arun C Murthy - Hortonworks
> * Ashutosh Chauhan - Hortonworks
> * Bikas Saha - Hortonworks
> * Chris Douglas - Microsoft
> * Daryn Sharp - Yahoo
> * Devaraj Das - Hortonworks
> * Gopal Vijayaraghavan - Hortonworks
> * Gunther Hagleitner - Hortonworks
> * Hitesh Shah - Hortonworks
> * Jason Lowe - Yahoo
> * Jean Xu - Facebook
> * Jitendra Pandey - Hortonworks
> * Julien Le Dem - Twitter
> * Kevin Wilfong - Facebook
> * Mike Liddell - Microsoft
> * Namit Jain - Facebook
> * Nathan Roberts - Yahoo
> * Owen O'Malley - Hortonworks
> * Robert Evans - Yahoo
> * Siddharth Seth - Hortonworks
> * Tom White - Cloudera
> * Thomas Graves - Yahoo
> * Vikram Dixit - Hortonworks
> * Vinod Kumar Vavilapalli - Hortonworks
> * William Graham - Twitter
>
>The nominated mentors are employees of Hortonworks, LinkedIn,
>NASA JPL and Microsoft.
> 
> * Alan Gates - Hortonworks
> * Arun C Murthy - Hortonworks
> * Chris Douglas - Microsoft
> * Chris Mattmann - NASA JPL
> * Jakob Homan - LinkedIn
> * Owen O'Malley - Hortonworks
>
>== Sponsors ==
>
>=== Champion ===
>Arun C Murthy <acmurthy at apache dot org>
>
>=== Nominated Mentors ===
> * Alan Gates <gates at apache dot org> – Architect at Hortonworks.
>Committer for Pig.
> * Arun C Murthy <acmurthy at apache dot org> – Architect at
>Hortonworks. Committer for Hadoop.
> * Chris Douglas <cdouglas at apache dot org> - Sr. Research Engineer at
>Microsoft. Committer for Hadoop.
> * Chris Mattmann <mattmann at apache dot org> - Sr. Computer Scientist,
>NASA JPL. Committer for Nutch, OODT and Tika.
> * Jakob Homan <jghoman at apache dot org> – Sr. Software Engineer,
>LinkedIn. Committer for Hadoop, Kafka, Giraph.
> * Owen O'Malley <omalley at apache dot org> – Architect at
>Hortonworks. Committer for Hadoop, Ambari.
>
>=== Sponsoring Entity ===
>Incubator
>
