From: Vinod Kumar Vavilapalli
Mime-Version: 1.0 (Apple Message framework v1283)
Content-Type: multipart/alternative; boundary="Apple-Mail=_D0C14B3F-FABB-4A2A-9DF7-27AABE11BCD4"
Subject: Re: [VOTE] Accept Tez into Incubator
Date: Thu, 21 Feb 2013 10:42:01 -0800
In-Reply-To: <8F1FC886-8A6F-49AE-8F48-61A22EA19209@hortonworks.com>
To: general@incubator.apache.org
References: <8F1FC886-8A6F-49AE-8F48-61A22EA19209@hortonworks.com>
Message-Id: <9BB75BEE-65A1-4DE6-8020-298A45856929@apache.org>

--Apple-Mail=_D0C14B3F-FABB-4A2A-9DF7-27AABE11BCD4
Content-Type: text/plain; charset=windows-1252

+1 (non-binding)

Thanks,
+Vinod

On Feb 19, 2013, at 8:26 PM, Arun C Murthy wrote:

> Hi Folks,
>
> Thanks for participating in the discussion. I'd like to call a VOTE for
> acceptance of Apache Tez into the Incubator. I'll let the vote run into
> this weekend (Sun 2/24 6pm PST).
>
> [ ] +1 Accept Apache Tez into the Incubator
> [ ] +0 Don't care.
> [ ] -1 Don't accept Apache Tez into the Incubator because...
>
> Full proposal is pasted at the bottom of this email, and the corresponding
> wiki is http://wiki.apache.org/incubator/TezProposal.
>
> Only VOTEs from Incubator PMC members are binding, but all are welcome to
> express their thoughts.
>
> Here's my +1 (binding).
>
> thanks,
> Arun
>
> PS: From the initial discussion, the only changes are that I've added one
> new mentor and 2 new committers. All the new additions come from employers
> other than the majority employer, and we continue to strive to diversify
> further during incubation. Thanks.
>
> ----
>
> = Tez =
>
> == Abstract ==
> Tez is an effort to develop a generic application framework that can
> process arbitrarily complex data-processing tasks, along with a re-usable
> set of data-processing primitives that other projects can use.
>
> == Proposal ==
> Tez is a proposal to develop a generic application that can process complex
> data-processing task DAGs and that runs natively on Apache Hadoop YARN.
> YARN is a generic resource-management system on which applications like
> MapReduce already run. MapReduce is a specific, constrained DAG, which is
> not optimal for several frameworks like Apache Hive and Apache Pig.
> Furthermore, we propose to develop a re-usable set of libraries of
> data-processing primitives such as sorting, merging, data-shuffling and
> intermediate data management, which Tez needs and which we envision other
> projects using directly.
>
> == Background ==
> Apache Hadoop MapReduce has emerged as the assembly language on which other
> frameworks like Apache Pig and Apache Hive have been built. However, it is
> well accepted that MapReduce produces very constrained task DAGs for each
> job, which results in Apache Pig and Apache Hive requiring multiple
> MapReduce jobs for several queries. By providing a more expressive DAG of
> tasks for a job, Tez attempts to provide significantly enhanced
> data-processing capabilities for projects like Apache Pig, Apache Hive,
> Cascading, etc.
>
> == Rationale ==
> Tez fills an important gap in the Apache Hadoop ecosystem by allowing more
> expressive task DAGs for data-processing applications such as Apache Pig,
> Apache Hive, Cascading, etc.
>
> With the emergence of Apache Hadoop YARN, there is a strong need for a
> common DAG application which can then be shared by Apache Pig, Apache Hive,
> Cascading, etc.
>
> == Initial Goals ==
> The initial goals for this project are to specify the detailed requirements
> and architecture, and then develop the initial implementation including the
> DAG ApplicationMaster to run natively inside Apache Hadoop YARN.
>
> == Current Status ==
> Significant work has been completed to identify the initial requirements and
> define the overall system architecture. There is a patch available in the
> internal Hortonworks git repository which can act as the initial seed.
>
> === Meritocracy ===
> We plan to invest in supporting a meritocracy. We will discuss the
> requirements in an open forum. Several companies have already expressed
> interest in this project, and we intend to invite additional developers to
> participate. We will encourage and monitor community participation so that
> privileges can be extended to those that contribute.
>
> === Community ===
> The need for a generic DAG application for data processing in open source
> is tremendous, so there is a potential for a very large community. We
> believe that Tez's extensible architecture will further encourage community
> participation. Also, related Apache projects (e.g., Pig, Hive) have very
> large and active communities, and we expect that over time Tez will also
> attract a large community.
>
> === Core Developers ===
> The developers on the initial committers list include people very experienced
> in the Apache Hadoop ecosystem:
>
> * Alan Gates
> * Arun C Murthy
> * Ashutosh Chauhan
> * Bikas Saha
> * Chris Douglas
> * Daryn Sharp
> * Devaraj Das
> * Gopal Vijayaraghavan
> * Gunther Hagleitner
> * Hitesh Shah
> * Jason Lowe
> * Jean Xu
> * Jitendra Pandey
> * Julien Le Dem
> * Kevin Wilfong
> * Mike Liddell
> * Namit Jain
> * Nathan Roberts
> * Owen O'Malley
> * Robert Evans
> * Siddharth Seth
> * Tom White
> * Thomas Graves
> * Vikram Dixit
> * Vinod Kumar Vavilapalli
> * William Graham
>
> We realize that though we have significant employer diversity already,
> additional diversity is always better, and we will work
> aggressively to recruit developers from additional companies.
>
> === Alignment ===
> The initial committers strongly believe that a standard task DAG
> application on Apache Hadoop YARN will gain broader adoption as an open
> source, community driven project, where the community can contribute not
> only to the core components, but also to a growing collection of
> applications which will be based on top of Tez. Our hope is that the Apache
> Hive, Apache Pig, Cascading and other communities will find tremendous
> value in Tez and will adopt it en masse.
>
> == Known Risks ==
>
> === Orphaned Products ===
> The contributors are leading users and vendors in the Apache Hadoop
> ecosystem, with significant open source experience, so the risk of being
> orphaned is relatively low. The project could be at risk if vendors decided
> to change their strategies in the market. In such an event, the current
> committers plan to continue working on the project on their own time,
> though the progress will likely be slower. We plan to mitigate this risk by
> recruiting additional committers.
>
> === Inexperience with Open Source ===
> The initial committers include veteran Apache members (Committers, PMC
> members and Apache Members) and other developers who have varying degrees
> of experience with open source projects. All have been involved with source
> code that has been released under an open source license, and several also
> have experience developing code with an open source development process.
>
> === Homogenous Developers ===
> The initial committers are employed by a number of companies, including
> Cloudera, Facebook, Hortonworks, Microsoft, Twitter and Yahoo. Although we
> already have significant diversity, we are committed to recruiting
> additional committers from other companies based on their contributions to
> the project.
>
> === Reliance on Salaried Developers ===
> It is expected that Tez development will occur on both salaried time and on
> volunteer time, after hours. The majority of initial committers are paid by
> their employer to contribute to this project. However, they are all
> passionate about the project, and we are confident that the project will
> continue even if no salaried developers contribute to the project. We are
> committed to recruiting additional committers including non-salaried
> developers.
>
> === Relationships with Other Apache Products ===
> As mentioned in the Alignment section, Tez is closely integrated with
> Hadoop, Hive and Pig in numerous ways. We look forward to collaborating
> with those communities, as well as other Apache communities.
>
> === An Excessive Fascination with the Apache Brand ===
> Tez solves a real need for generic task DAG management in the Apache Hadoop
> ecosystem, something which has been addressed in a very ad hoc manner so far
> by multiple Apache projects.
> Our rationale for developing Tez as an Apache project is detailed in the
> Rationale section. We believe that the Apache brand and community process
> will help us attract more contributors to this project, and help establish
> ubiquitous APIs.
>
> == Documentation ==
> http://wiki.apache.org/incubator/TezProposal
>
> == Initial Source ==
> Available as a patch.
>
> == Cryptography ==
> Tez will eventually support encryption on the wire. This is not one of the
> initial goals, and we do not expect Tez to be a controlled export item due
> to the use of encryption.
>
> == Required Resources ==
>
> === Mailing List ===
> * tez-private
> * tez-dev
> * tez-user
>
> === Subversion Directory ===
> Git is the preferred source control system: git://git.apache.org/tez
>
> === Issue Tracking ===
>
> JIRA Tez (TEZ)
>
> == Initial Committers ==
> * Alan Gates
> * Arun C Murthy
> * Ashutosh Chauhan
> * Bikas Saha
> * Chris Douglas
> * Daryn Sharp
> * Devaraj Das
> * Gopal Vijayaraghavan
> * Gunther Hagleitner
> * Hitesh Shah
> * Jason Lowe
> * Jean Xu
> * Jitendra Pandey
> * Julien Le Dem
> * Kevin Wilfong
> * Mike Liddell
> * Namit Jain
> * Nathan Roberts
> * Owen O'Malley
> * Robert Evans
> * Siddharth Seth
> * Tom White
> * Thomas Graves
> * Vikram Dixit
> * Vinod Kumar Vavilapalli
> * William Graham
>
> == Affiliations ==
> The initial committers are employees of Cloudera, Facebook, Hortonworks,
> Microsoft, Twitter and Yahoo Inc.
>
> * Alan Gates - Hortonworks
> * Arun C Murthy - Hortonworks
> * Ashutosh Chauhan - Hortonworks
> * Bikas Saha - Hortonworks
> * Chris Douglas - Microsoft
> * Daryn Sharp - Yahoo
> * Devaraj Das - Hortonworks
> * Gopal Vijayaraghavan - Hortonworks
> * Gunther Hagleitner - Hortonworks
> * Hitesh Shah - Hortonworks
> * Jason Lowe - Yahoo
> * Jean Xu - Facebook
> * Jitendra Pandey - Hortonworks
> * Julien Le Dem - Twitter
> * Kevin Wilfong - Facebook
> * Mike Liddell - Microsoft
> * Namit Jain - Facebook
> * Nathan Roberts - Yahoo
> * Owen O'Malley - Hortonworks
> * Robert Evans - Yahoo
> * Siddharth Seth - Hortonworks
> * Tom White - Cloudera
> * Thomas Graves - Yahoo
> * Vikram Dixit - Hortonworks
> * Vinod Kumar Vavilapalli - Hortonworks
> * William Graham - Twitter
>
> The nominated mentors are employees of Hortonworks, LinkedIn,
> NASA JPL and Microsoft.
>
> * Alan Gates - Hortonworks
> * Arun C Murthy - Hortonworks
> * Chris Douglas - Microsoft
> * Chris Mattman - NASA JPL
> * Jakob Homan - LinkedIn
> * Owen O'Malley - Hortonworks
>
> == Sponsors ==
>
> === Champion ===
> Arun C Murthy
>
> === Nominated Mentors ===
> * Alan Gates - Architect at Hortonworks. Committer for Pig.
> * Arun C Murthy - Architect at Hortonworks. Committer for Hadoop.
> * Chris Douglas - Sr. Research Engineer at Microsoft. Committer for Hadoop.
> * Chris Mattman - Sr. Computer Scientist, NASA JPL. Committer for Nutch, OODT and Tika.
> * Jakob Homan - Sr. Software Engineer, LinkedIn. Committer for Hadoop, Kafka, Giraph.
> * Owen O'Malley - Architect at Hortonworks. Committer for Hadoop, Ambari.
>
> === Sponsoring Entity ===
> Incubator
>

--Apple-Mail=_D0C14B3F-FABB-4A2A-9DF7-27AABE11BCD4--