In-Reply-To: <20151126195011.GR28434@tpx>
References: <20151126194734.GQ28434@tpx> <20151126195011.GR28434@tpx>
Date: Thu, 26 Nov 2015 14:52:19 -0500
Subject: Re: [VOTE] Accept Impala into the Apache Incubator
From: Joe Witt
To: general@incubator.apache.org

+1 (non-binding)

On Thu, Nov 26, 2015 at 2:50 PM, Konstantin Boudnik wrote:
> Come to think of it a bit more, yes I am not satisfied with the outcome of
> the CTR/RTC exchange in the project.
>
> Hence changing my vote to
> -1 [binding]
>
> On Thu, Nov 26, 2015 at 11:47AM, Konstantin Boudnik wrote:
>> -0 [binding]
>>
>> On Tue, Nov 24, 2015 at 01:03PM, Henry Robinson wrote:
>> > Hi -
>> >
>> > The [DISCUSS] thread has been quiet for a few days, so I think there's been
>> > sufficient opportunity for discussion around our proposal to bring Impala
>> > to the ASF Incubator.
>> >
>> > I'd like to call a VOTE on that proposal, which is on the wiki at
>> > https://wiki.apache.org/incubator/ImpalaProposal, and which I've pasted
>> > below.
>> >
>> > During the discussion period, the proposal has been amended to add Brock
>> > Noland as a new mentor, to add one missed committer from the list and to
>> > correct some issues with the dependency list.
>> >
>> > Please cast your votes as follows:
>> >
>> > [] +1, accept Impala into the Incubator
>> > [] +/-0, non-counted vote to express a disposition
>> > [] -1, do not accept Impala into the Incubator (please give your reason(s))
>> >
>> > As with the concurrent Kudu vote, I propose leaving the vote open for a
>> > full seven days (to close at Tuesday, December 1st at noon PST), due to the
>> > upcoming US holiday.
>> >
>> > Thanks,
>> > Henry
>> >
>> > --------
>> >
>> > = Abstract =
>> > Impala is a high-performance C++ and Java SQL query engine for data stored
>> > in Apache Hadoop-based clusters.
>> >
>> > = Proposal =
>> >
>> > We propose to contribute the Impala codebase and associated artifacts (e.g.
>> > documentation, web-site content etc.) to the Apache Software Foundation
>> > with the intent of forming a productive, meritocratic and open community
>> > around Impala’s continued development, according to the ‘Apache Way’.
>> >
>> > Cloudera owns several trademarks regarding Impala, and proposes to transfer
>> > ownership of those trademarks in full to the ASF.
>> >
>> > = Background =
>> > Engineers at Cloudera developed Impala and released it as an
>> > Apache-licensed open-source project in Fall 2012. Impala was written as a
>> > brand-new, modern C++ SQL engine targeted from the start for data stored in
>> > Apache Hadoop clusters.
>> >
>> > Impala’s most important benefit to users is high-performance, making it
>> > extremely appropriate for common enterprise analytic and business
>> > intelligence workloads.
>> > This is achieved by a number of software
>> > techniques, including: native support for data stored in HDFS and related
>> > filesystems, just-in-time compilation and optimization of individual query
>> > plans, high-performance C++ codebase and massively-parallel distributed
>> > architecture. In benchmarks, Impala is routinely amongst the very highest
>> > performing SQL query engines.
>> >
>> > = Rationale =
>> >
>> > Despite the exciting innovation in the so-called ‘big-data’ space, SQL
>> > remains by far the most common interface for interacting with data in both
>> > traditional warehouses and modern ‘big-data’ clusters. There is clearly a
>> > need, as evidenced by the eager adoption of Impala and other SQL engines in
>> > enterprise contexts, for a query engine that offers the familiar SQL
>> > interface, but that has been specifically designed to operate in massive,
>> > distributed clusters rather than in traditional, fixed-hardware,
>> > warehouse-specific deployments. Impala is one such query engine.
>> >
>> > We believe that the ASF is the right venue to foster an open-source
>> > community around Impala’s development. We expect that Impala will benefit
>> > from more productive collaboration with related Apache projects, and under
>> > the auspices of the ASF will attract talented contributors who will push
>> > Impala’s development forward at pace.
>> >
>> > We believe that the timing is right for Impala’s development to move
>> > wholesale to the ASF: Impala is well-established, has been Apache-licensed
>> > open-source for more than three years, and the core project is relatively
>> > stable. We are excited to see where an ASF-based community can take Impala
>> > from this strong starting point.
>> >
>> > = Initial Goals =
>> > Our initial goals are as follows:
>> >
>> > * Establish ASF-compatible engineering practices and workflows
>> > * Refactor and publish existing internal build scripts and test
>> >   infrastructure, in order to make them usable by any community member.
>> > * Transfer source code, documentation and associated artifacts to the ASF.
>> > * Grow the user and developer communities
>> >
>> > = Current Status =
>> >
>> > Impala is developed as an Apache-licensed open-source project. The source
>> > code is available at http://github.com/cloudera/Impala, and developer
>> > documentation is at https://github.com/cloudera/Impala/wiki. The majority
>> > of commits to the project have come from Cloudera-employed developers, but
>> > we have accepted some contributions from individuals from other
>> > organizations.
>> >
>> > All code reviews are done via a public instance of the Gerrit review tool
>> > at http://gerrit.cloudera.org:8080/, and discussed on a public mailing
>> > list. All patches must be reviewed before they are accepted into the
>> > codebase, via a voting mechanism that is similar to that used on Apache
>> > projects such as Hadoop and HBase.
>> >
>> > Before a patch is committed, it must pass a suite of pre-commit tests.
>> > These tests are currently run on Cloudera’s internal infrastructure. One of
>> > our initial goals will be to work with the ASF Infrastructure team to find
>> > a way to run these tests in an acceptable way on publicly accessible
>> > machines.
>> >
>> > Issues are tracked in JIRA at https://issues.cloudera.org/projects/IMPALA,
>> > in a way that is extremely similar to existing practices at other ASF
>> > projects.
>> >
>> > = Meritocracy =
>> >
>> > We understand the central importance of meritocracy to the Apache Way. We
>> > will work to establish a welcoming, fair and meritocratic community, in
>> > part by expanding the set of committers on the project.
>> > Although Impala’s committer list will initially be dominated by members of
>> > the Impala engineering team at Cloudera, we look forward to growing a rich
>> > user and developer community.
>> >
>> > = Community =
>> > Impala has a strong user community (see
>> > https://groups.google.com/a/cloudera.org/forum/#!forum/impala-user), and a
>> > growing developer community (see
>> > https://groups.google.com/a/cloudera.org/forum/#!forum/impala-dev). We wish
>> > to attract more developers to the project, and we believe that the ASF’s
>> > open and meritocratic philosophy will help us with this. We note the
>> > success of other, similar projects already part of the ASF.
>> >
>> > = Core Developers =
>> > Most - but not all - of Impala’s core developers are not currently
>> > affiliated with the ASF, and will require new ICLAs.
>> >
>> > = Alignment =
>> > Impala is related to several other Apache projects:
>> >
>> > * Data that is read by Impala is very often stored in Apache Hadoop
>> >   clusters powered by the HDFS filesystem.
>> > * Impala can also read data stored in Apache HBase
>> > * Metadata for databases, tables and so on is read by Impala from Apache
>> >   Hive.
>> > * The preferred data format for HDFS-based tables is Apache Parquet, and
>> >   Apache Avro is also a supported data format.
>> > * Impala is closely integrated with Kudu, which is also being proposed to
>> >   the Incubator.
>> > * Impala uses Apache Thrift as its RPC and serialization framework of
>> >   choice.
>> >
>> > = Known Risks =
>> >
>> > == Orphaned Products ==
>> > Impala is used by most of Cloudera’s customers, and Cloudera remains
>> > committed to developing and supporting the project. Cloudera has a strong
>> > track record in standing behind projects that were contributed to the ASF
>> > by its employees, including Apache Flume, Apache Sqoop, and others.
>> > Other companies both ship and support Impala, lending credence to the idea
>> > that Impala is not at risk of being suddenly orphaned.
>> >
>> > == Inexperience with Open Source ==
>> > Although all committers on the initial list have significant experience
>> > with at least one open-source project - namely Impala - fewer have much
>> > experience with ASF-based software projects as contributors and community
>> > members. However, with the guidance of our mentors, committers who do have
>> > ASF experience, and time to learn during Incubation, we are confident that
>> > the project can be run in accordance with Apache principles on an ongoing
>> > basis.
>> >
>> > == Homogeneous Developers ==
>> >
>> > The initial committers are employees of Cloudera.
>> >
>> > The project has received some contributions from developers outside of
>> > Cloudera, from individuals belonging to organizations such as Intel and
>> > Google, from hobbyists and from students using Impala to advance their
>> > understanding of distributed databases. The project attracted an active
>> > user community as well. We hope to continue to encourage contributions from
>> > these developers and community members and grow them into committers after
>> > they have had time to continue their contributions.
>> >
>> > == Reliance on Salaried Developers ==
>> >
>> > Many of Impala’s initial set of committers work full-time on Impala, and
>> > are paid to do so. However, as mentioned elsewhere, we anticipate growth in
>> > the developer community which we hope will include hobbyists and academics
>> > who have an interest in distributed data systems.
>> >
>> > == An Excessive Fascination with the Apache Brand ==
>> > Although we hope that Impala benefits from the Apache Brand, any reflected
>> > goodwill to Cloudera as the contributing entity is not the goal of
>> > establishing Impala as an Apache project.
>> > We will work with the Incubator PMC and the PRC to ensure that the Apache
>> > Brand is respected.
>> >
>> > = Documentation =
>> > Impala: A Modern, Open-Source SQL Engine for Hadoop (
>> > http://www.cidrdb.org/cidr2015/Papers/CIDR15_Paper28.pdf)
>> >
>> > Impala’s developer wiki (https://github.com/cloudera/Impala/wiki)
>> >
>> > Impala’s auto-generated API documentation (
>> > http://impala.io/doc/html/index.html)
>> >
>> > = Initial Source =
>> > Impala’s initial source contribution will come from
>> > http://github.com/cloudera/Impala/.
>> >
>> > = External Dependencies =
>> >
>> > Impala depends upon a number of third-party libraries, which we list below.
>> > We intend to compile a LICENSE.txt file in the very short term (see
>> > https://issues.cloudera.org/browse/IMPALA-2670).
>> >
>> > * Google gflags (BSD)
>> > * Google glog (BSD)
>> > * Apache Thrift (Apache Software License v2.0)
>> > * Apache Commons (Apache Software License v2.0)
>> > * Apache Hadoop (Apache Software License v2.0)
>> > * Apache HBase (Apache Software License v2.0)
>> > * Apache Hive (Apache Software License v2.0)
>> > * Boost (Boost Software License)
>> > * OpenLdap (OpenLDAP Software License)
>> > * rapidjson (MIT)
>> > * Google RE2 (BSD-style)
>> > * lz4 (BSD)
>> > * snappy (BSD)
>> > * cyrus-sasl (CMU License)
>> > * Apache Avro (Apache Software License v2.0)
>> > * Cloudera squeasel (Apache Software License v2.0)
>> > * Apache htrace (Incubating) (Apache Software License v2.0)
>> > * Apache Sentry (Incubating) (Apache Software License v2.0)
>> > * Apache Shiro (Apache Software License v2.0)
>> > * Twitter Bootstrap (Apache Software License v2.0)
>> > * d3 (BSD)
>> > * LLVM (BSD-like)
>> >
>> > Build and test dependencies:
>> >
>> > * ant (Apache Software License v2.0)
>> > * Apache Maven (Apache Software License v2.0)
>> > * cmake (BSD)
>> > * clang (BSD)
>> > * Google gtest (Apache Software License v2.0)
>> >
>> > = Required Resources =
>> >
>> > We request that the following resources be created for the project to use:
>> >
>> > == Mailing lists ==
>> >
>> > * private@impala.incubator.apache.org (moderated subscriptions)
>> > * commits@impala.incubator.apache.org
>> > * dev@impala.incubator.apache.org
>> > * issues@impala.incubator.apache.org
>> > * user@impala.incubator.apache.org
>> >
>> > == Git repository ==
>> > https://git.apache.org/impala.git
>> >
>> > == JIRA instance ==
>> > JIRA project IMPALA (IMPALA or IMP)
>> >
>> > == Other Resources ==
>> > We hope to continue using Gerrit for our code review and commit workflow.
>> > We are involved with discussions that the Kudu team at Cloudera have been
>> > having with Jake Farrell to start discussions on how Gerrit can fit into
>> > the ASF. We know that several other ASF projects or podlings are also
>> > interested in Gerrit.
>> >
>> > If the Infrastructure team does not have the bandwidth to support gerrit,
>> > we will continue to support our own instance of gerrit for Impala, and make
>> > the necessary integrations such that commits are properly authenticated and
>> > maintain sufficient provenance to uphold the ASF standards (e.g. via the
>> > solution adopted by the AsterixDB podling).
>> >
>> > = Initial Committers =
>> >
>> > * Tim Armstrong
>> > * Alex Behm
>> > * Taras Bobrovytsky
>> > * Casey Ching
>> > * Martin Grund
>> > * Daniel Hecht
>> > * Michael Ho
>> > * Matthew Jacobs
>> > * Ishaan Joshi
>> > * Lenni Kuff
>> > * Marcel Kornacker
>> > * Sailesh Mukil
>> > * Henry Robinson
>> > * John Russell
>> > * Dimitris Tsirogiannis
>> > * Skye Wanderman-Milne
>> > * Juan Yu
>> >
>> > == Affiliations ==
>> > All: Cloudera Inc.
>> >
>> > = Sponsors =
>> >
>> > == Champion ==
>> > Tom White
>> >
>> > == Nominated Mentors ==
>> > * Tom White (Cloudera)
>> > * Todd Lipcon (Cloudera)
>> > * Carl Steinbach (LinkedIn)
>> > * Brock Noland (StreamSets)
>> >
>> >
>> > = Sponsoring Entity =
>> > We ask that the Incubator PMC sponsor this proposal.
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
For additional commands, e-mail: general-help@incubator.apache.org