Date: Tue, 24 Nov 2015 16:23:07 -0500
Subject: Re: [VOTE] Accept Impala into the Apache Incubator
From: Patrick Angeles
To: general@incubator.apache.org

+1 (non binding)

On Tue, Nov 24, 2015 at 4:21 PM, Arvind Prabhakar wrote:

> +1 (binding)
>
> Regards,
> Arvind Prabhakar
>
> On Tue, Nov 24, 2015 at 1:03 PM, Henry Robinson wrote:
>
> > Hi -
> >
> > The [DISCUSS] thread has been quiet for a few days, so I think there's been
> > sufficient opportunity for discussion around our proposal to bring Impala
> > to the ASF Incubator.
> >
> > I'd like to call a VOTE on that proposal, which is on the wiki at
> > https://wiki.apache.org/incubator/ImpalaProposal, and which I've pasted
> > below.
> >
> > During the discussion period, the proposal has been amended to add Brock
> > Noland as a new mentor, to add one committer missed from the list and to
> > correct some issues with the dependency list.
> >
> > Please cast your votes as follows:
> >
> > [] +1, accept Impala into the Incubator
> > [] +/-0, non-counted vote to express a disposition
> > [] -1, do not accept Impala into the Incubator (please give your reason(s))
> >
> > As with the concurrent Kudu vote, I propose leaving the vote open for a
> > full seven days (to close on Tuesday, December 1st at noon PST), due to the
> > upcoming US holiday.
> >
> > Thanks,
> > Henry
> >
> > --------
> >
> > = Abstract =
> > Impala is a high-performance C++ and Java SQL query engine for data stored
> > in Apache Hadoop-based clusters.
> >
> > = Proposal =
> >
> > We propose to contribute the Impala codebase and associated artifacts (e.g.
> > documentation, web-site content etc.) to the Apache Software Foundation
> > with the intent of forming a productive, meritocratic and open community
> > around Impala’s continued development, according to the ‘Apache Way’.
> >
> > Cloudera owns several trademarks regarding Impala, and proposes to transfer
> > ownership of those trademarks in full to the ASF.
> >
> > = Background =
> > Engineers at Cloudera developed Impala and released it as an
> > Apache-licensed open-source project in Fall 2012. Impala was written as a
> > brand-new, modern C++ SQL engine targeted from the start for data stored in
> > Apache Hadoop clusters.
> >
> > Impala’s most important benefit to users is high performance, making it
> > extremely appropriate for common enterprise analytic and business
> > intelligence workloads.
> > This is achieved by a number of software techniques, including: native
> > support for data stored in HDFS and related filesystems, just-in-time
> > compilation and optimization of individual query plans, a high-performance
> > C++ codebase and a massively parallel distributed architecture. In
> > benchmarks, Impala is routinely amongst the very highest performing SQL
> > query engines.
> >
> > = Rationale =
> >
> > Despite the exciting innovation in the so-called ‘big-data’ space, SQL
> > remains by far the most common interface for interacting with data in both
> > traditional warehouses and modern ‘big-data’ clusters. There is clearly a
> > need, as evidenced by the eager adoption of Impala and other SQL engines in
> > enterprise contexts, for a query engine that offers the familiar SQL
> > interface, but that has been specifically designed to operate in massive,
> > distributed clusters rather than in traditional, fixed-hardware,
> > warehouse-specific deployments. Impala is one such query engine.
> >
> > We believe that the ASF is the right venue to foster an open-source
> > community around Impala’s development. We expect that Impala will benefit
> > from more productive collaboration with related Apache projects, and under
> > the auspices of the ASF will attract talented contributors who will push
> > Impala’s development forward at pace.
> >
> > We believe that the timing is right for Impala’s development to move
> > wholesale to the ASF: Impala is well-established, has been Apache-licensed
> > open-source for more than three years, and the core project is relatively
> > stable. We are excited to see where an ASF-based community can take Impala
> > from this strong starting point.
> >
> > = Initial Goals =
> > Our initial goals are as follows:
> >
> > * Establish ASF-compatible engineering practices and workflows
> > * Refactor and publish existing internal build scripts and test
> > infrastructure, in order to make them usable by any community member.
> > * Transfer source code, documentation and associated artifacts to the ASF.
> > * Grow the user and developer communities
> >
> > = Current Status =
> >
> > Impala is developed as an Apache-licensed open-source project. The source
> > code is available at http://github.com/cloudera/Impala, and developer
> > documentation is at https://github.com/cloudera/Impala/wiki. The majority
> > of commits to the project have come from Cloudera-employed developers, but
> > we have accepted some contributions from individuals from other
> > organizations.
> >
> > All code reviews are done via a public instance of the Gerrit review tool
> > at http://gerrit.cloudera.org:8080/, and discussed on a public mailing
> > list. All patches must be reviewed before they are accepted into the
> > codebase, via a voting mechanism that is similar to that used on Apache
> > projects such as Hadoop and HBase.
> >
> > Before a patch is committed, it must pass a suite of pre-commit tests.
> > These tests are currently run on Cloudera’s internal infrastructure. One of
> > our initial goals will be to work with the ASF Infrastructure team to find
> > a way to run these tests in an acceptable way on publicly accessible
> > machines.
> >
> > Issues are tracked in JIRA at https://issues.cloudera.org/projects/IMPALA,
> > in a way that is extremely similar to existing practices at other ASF
> > projects.
> >
> > = Meritocracy =
> >
> > We understand the central importance of meritocracy to the Apache Way. We
> > will work to establish a welcoming, fair and meritocratic community, in
> > part by expanding the set of committers on the project.
> > Although Impala’s committer list will initially be dominated by members of
> > the Impala engineering team at Cloudera, we look forward to growing a rich
> > user and developer community.
> >
> > = Community =
> > Impala has a strong user community (see
> > https://groups.google.com/a/cloudera.org/forum/#!forum/impala-user), and a
> > growing developer community (see
> > https://groups.google.com/a/cloudera.org/forum/#!forum/impala-dev). We wish
> > to attract more developers to the project, and we believe that the ASF’s
> > open and meritocratic philosophy will help us with this. We note the
> > success of other, similar projects already part of the ASF.
> >
> > = Core Developers =
> > Most - but not all - of Impala’s core developers are not currently
> > affiliated with the ASF, and will require new ICLAs.
> >
> > = Alignment =
> > Impala is related to several other Apache projects:
> >
> > * Data that is read by Impala is very often stored in Apache Hadoop
> > clusters powered by the HDFS filesystem.
> > * Impala can also read data stored in Apache HBase
> > * Metadata for databases, tables and so on is read by Impala from Apache
> > Hive.
> > * The preferred data format for HDFS-based tables is Apache Parquet, and
> > Apache Avro is also a supported data format.
> > * Impala is closely integrated with Kudu, which is also being proposed to
> > the Incubator.
> > * Impala uses Apache Thrift as its RPC and serialization framework of
> > choice.
> >
> > = Known Risks =
> >
> > == Orphaned Products ==
> > Impala is used by most of Cloudera’s customers, and Cloudera remains
> > committed to developing and supporting the project. Cloudera has a strong
> > track record in standing behind projects that were contributed to the ASF
> > by its employees, including Apache Flume, Apache Sqoop, and others.
> > Other companies both ship and support Impala, lending credence to the idea
> > that Impala is not at risk of being suddenly orphaned.
> >
> > == Inexperience with Open Source ==
> > Although all committers on the initial list have significant experience
> > with at least one open-source project - namely Impala - fewer have much
> > experience with ASF-based software projects as contributors and community
> > members. However, with the guidance of our mentors, committers who do have
> > ASF experience, and time to learn during Incubation, we are confident that
> > the project can be run in accordance with Apache principles on an ongoing
> > basis.
> >
> > == Homogeneous Developers ==
> >
> > The initial committers are employees of Cloudera.
> >
> > The project has received some contributions from developers outside of
> > Cloudera, from individuals belonging to organizations such as Intel and
> > Google, from hobbyists and from students using Impala to advance their
> > understanding of distributed databases. The project has attracted an active
> > user community as well. We hope to continue to encourage contributions from
> > these developers and community members and grow them into committers after
> > they have had time to continue their contributions.
> >
> > == Reliance on Salaried Developers ==
> >
> > Many of Impala’s initial set of committers work full-time on Impala, and
> > are paid to do so. However, as mentioned elsewhere, we anticipate growth in
> > the developer community which we hope will include hobbyists and academics
> > who have an interest in distributed data systems.
> >
> > == An Excessive Fascination with the Apache Brand ==
> > Although we hope that Impala benefits from the Apache Brand, any reflected
> > goodwill to Cloudera as the contributing entity is not the goal of
> > establishing Impala as an Apache project.
> > We will work with the Incubator PMC and the PRC to ensure that the Apache
> > Brand is respected.
> >
> > = Documentation =
> > Impala: A Modern, Open-Source SQL Engine for Hadoop (
> > http://www.cidrdb.org/cidr2015/Papers/CIDR15_Paper28.pdf)
> >
> > Impala’s developer wiki (https://github.com/cloudera/Impala/wiki)
> >
> > Impala’s auto-generated API documentation (
> > http://impala.io/doc/html/index.html)
> >
> > = Initial Source =
> > Impala’s initial source contribution will come from
> > http://github.com/cloudera/Impala/.
> >
> > = External Dependencies =
> >
> > Impala depends upon a number of third-party libraries, which we list below.
> > We intend to compile a LICENSE.txt file in the very short term (see
> > https://issues.cloudera.org/browse/IMPALA-2670).
> >
> > * Google gflags (BSD)
> > * Google glog (BSD)
> > * Apache Thrift (Apache Software License v2.0)
> > * Apache Commons (Apache Software License v2.0)
> > * Apache Hadoop (Apache Software License v2.0)
> > * Apache HBase (Apache Software License v2.0)
> > * Apache Hive (Apache Software License v2.0)
> > * Boost (Boost Software License)
> > * OpenLdap (OpenLDAP Software License)
> > * rapidjson (MIT)
> > * Google RE2 (BSD-style)
> > * lz4 (BSD)
> > * snappy (BSD)
> > * cyrus-sasl (CMU License)
> > * Apache Avro (Apache Software License v2.0)
> > * Cloudera squeasel (Apache Software License v2.0)
> > * Apache htrace (Incubating) (Apache Software License v2.0)
> > * Apache Sentry (Incubating) (Apache Software License v2.0)
> > * Apache Shiro (Apache Software License v2.0)
> > * Twitter Bootstrap (Apache Software License v2.0)
> > * d3 (BSD)
> > * LLVM (BSD-like)
> >
> > Build and test dependencies:
> >
> > * ant (Apache Software License v2.0)
> > * Apache Maven (Apache Software License v2.0)
> > * cmake (BSD)
> > * clang (BSD)
> > * Google gtest (Apache Software License v2.0)
> >
> > = Required Resources =
> >
> > We request that the following resources be
> > created for the project to use:
> >
> > == Mailing lists ==
> >
> > * private@impala.incubator.apache.org (moderated subscriptions)
> > * commits@impala.incubator.apache.org
> > * dev@impala.incubator.apache.org
> > * issues@impala.incubator.apache.org
> > * user@impala.incubator.apache.org
> >
> > == Git repository ==
> > https://git.apache.org/impala.git
> >
> > == JIRA instance ==
> > JIRA project IMPALA (IMPALA or IMP)
> >
> > == Other Resources ==
> > We hope to continue using Gerrit for our code review and commit workflow.
> > We are taking part in the discussions that the Kudu team at Cloudera has
> > been having with Jake Farrell about how Gerrit can fit into the ASF. We
> > know that several other ASF projects or podlings are also interested in
> > Gerrit.
> >
> > If the Infrastructure team does not have the bandwidth to support Gerrit,
> > we will continue to support our own instance of Gerrit for Impala, and make
> > the necessary integrations such that commits are properly authenticated and
> > maintain sufficient provenance to uphold the ASF standards (e.g. via the
> > solution adopted by the AsterixDB podling).
> >
> > = Initial Committers =
> >
> > * Tim Armstrong
> > * Alex Behm
> > * Taras Bobrovytsky
> > * Casey Ching
> > * Martin Grund
> > * Daniel Hecht
> > * Michael Ho
> > * Matthew Jacobs
> > * Ishaan Joshi
> > * Lenni Kuff
> > * Marcel Kornacker
> > * Sailesh Mukil
> > * Henry Robinson
> > * John Russell
> > * Dimitris Tsirogiannis
> > * Skye Wanderman-Milne
> > * Juan Yu
> >
> > == Affiliations ==
> > All: Cloudera Inc.
> >
> > = Sponsors =
> >
> > == Champion ==
> > Tom White
> >
> > == Nominated Mentors ==
> > * Tom White (Cloudera)
> > * Todd Lipcon (Cloudera)
> > * Carl Steinbach (LinkedIn)
> > * Brock Noland (StreamSets)
> >
> > = Sponsoring Entity =
> > We ask that the Incubator PMC sponsor this proposal.