incubator-general mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Joe Witt <joe.w...@gmail.com>
Subject Re: [VOTE] Accept Impala into the Apache Incubator
Date Thu, 26 Nov 2015 19:52:19 GMT
+1 (non-binding)

On Thu, Nov 26, 2015 at 2:50 PM, Konstantin Boudnik <cos@apache.org> wrote:
> Come to think of it a bit more, yes I am not satisfied with the outcome of
> the CTR/RTC exchange in the project.
>
> Hence changing my vote to
>  -1 [binding]
>
> On Thu, Nov 26, 2015 at 11:47AM, Konstantin Boudnik wrote:
>> -0 [binding]
>>
>> On Tue, Nov 24, 2015 at 01:03PM, Henry Robinson wrote:
>> > Hi -
>> >
>> > The [DISCUSS] thread has been quiet for a few days, so I think there's been
>> > sufficient opportunity for discussion around our proposal to bring Impala
>> > to the ASF Incubator.
>> >
>> > I'd like to call a VOTE on that proposal, which is on the wiki at
>> > https://wiki.apache.org/incubator/ImpalaProposal, and which I've pasted
>> > below.
>> >
>> > During the discussion period, the proposal has been amended to add Brock
>> > Noland as a new mentor, to add one missed committer from the list and to
>> > correct some issues with the dependency list.
>> >
>> > Please cast your votes as follows:
>> >
>> > [] +1, accept Impala into the Incubator
>> > [] +/-0, non-counted vote to express a disposition
>> > [] -1, do not accept Impala into the Incubator (please give your reason(s))
>> >
>> > As with the concurrent Kudu vote, I propose leaving the vote open for a
>> > full seven days (to close at Tuesday, December 1st at noon PST), due to the
>> > upcoming US holiday.
>> >
>> > Thanks,
>> > Henry
>> >
>> > --------
>> >
>> > = Abstract =
>> > Impala is a high-performance C++ and Java SQL query engine for data stored
>> > in Apache Hadoop-based clusters.
>> >
>> > = Proposal =
>> >
>> > We propose to contribute the Impala codebase and associated artifacts (e.g.
>> > documentation, web-site content etc.) to the Apache Software Foundation
>> > with the intent of forming a productive, meritocratic and open community
>> > around Impala’s continued development, according to the ‘Apache Way’.
>> >
>> > Cloudera owns several trademarks regarding Impala, and proposes to transfer
>> > ownership of those trademarks in full to the ASF.
>> >
>> > = Background =
>> > Engineers at Cloudera developed Impala and released it as an
>> > Apache-licensed open-source project in Fall 2012. Impala was written as a
>> > brand-new, modern C++ SQL engine targeted from the start for data stored in
>> > Apache Hadoop clusters.
>> >
>> > Impala’s most important benefit to users is high-performance, making it
>> > extremely appropriate for common enterprise analytic and business
>> > intelligence workloads. This is achieved by a number of software
>> > techniques, including: native support for data stored in HDFS and related
>> > filesystems, just-in-time compilation and optimization of individual query
>> > plans, high-performance C++ codebase and massively-parallel distributed
>> > architecture. In benchmarks, Impala is routinely amongst the very highest
>> > performing SQL query engines.
>> >
>> > = Rationale =
>> >
>> > Despite the exciting innovation in the so-called ‘big-data’ space, SQL
>> > remains by far the most common interface for interacting with data in both
>> > traditional warehouses and modern ‘big-data’ clusters. There is clearly
a
>> > need, as evidenced by the eager adoption of Impala and other SQL engines in
>> > enterprise contexts, for a query engine that offers the familiar SQL
>> > interface, but that has been specifically designed to operate in massive,
>> > distributed clusters rather than in traditional, fixed-hardware,
>> > warehouse-specific deployments. Impala is one such query engine.
>> >
>> > We believe that the ASF is the right venue to foster an open-source
>> > community around Impala’s development. We expect that Impala will benefit
>> > from more productive collaboration with related Apache projects, and under
>> > the auspices of the ASF will attract talented contributors who will push
>> > Impala’s development forward at pace.
>> >
>> > We believe that the timing is right for Impala’s development to move
>> > wholesale to the ASF: Impala is well-established, has been Apache-licensed
>> > open-source for more than three years, and the core project is relatively
>> > stable. We are excited to see where an ASF-based community can take Impala
>> > from this strong starting point.
>> >
>> > = Initial Goals =
>> > Our initial goals are as follows:
>> >
>> >  * Establish ASF-compatible engineering practices and workflows
>> >  * Refactor and publish existing internal build scripts and test
>> > infrastructure, in order to make them usable by any community member.
>> >  * Transfer source code, documentation and associated artifacts to the ASF.
>> >  * Grow the user and developer communities
>> >
>> > = Current Status =
>> >
>> > Impala is developed as an Apache-licensed open-source project. The source
>> > code is available at http://github.com/cloudera/Impala, and developer
>> > documentation is at https://github.com/cloudera/Impala/wiki. The majority
>> > of commits to the project have come from Cloudera-employed developers, but
>> > we have accepted some contributions from individuals from other
>> > organizations.
>> >
>> > All code reviews are done via a public instance of the Gerrit review tool
>> > at http://gerrit.cloudera.org:8080/, and discussed on a public mailing
>> > list. All patches must be reviewed before they are accepted into the
>> > codebase, via a voting mechanism that is similar to that used on Apache
>> > projects such as Hadoop and HBase.
>> >
>> > Before a patch is committed, it must pass a suite of pre-commit tests.
>> > These tests are currently run on Cloudera’s internal infrastructure. One of
>> > our initial goals will be to work with the ASF Infrastructure team to find
>> > a way to run these tests in an acceptable way on publicly accessible
>> > machines.
>> >
>> > Issues are tracked in JIRA at https://issues.cloudera.org/projects/IMPALA,
>> > in a way that is extremely similar to existing practices at other ASF
>> > projects.
>> >
>> > = Meritocracy =
>> >
>> > We understand the central importance of meritocracy to the Apache Way. We
>> > will work to establish a welcoming, fair and meritocratic community, in
>> > part by expanding the set of committers on the project. Although Impala’s
>> > committer list will initially be dominated by members of the Impala
>> > engineering team at Cloudera, we look forward to growing a rich user and
>> > developer community.
>> >
>> > = Community =
>> > Impala has a strong user community (see
>> > https://groups.google.com/a/cloudera.org/forum/#!forum/impala-user), and a
>> > growing developer community (see
>> > https://groups.google.com/a/cloudera.org/forum/#!forum/impala-dev). We wish
>> > to attract more developers to the project, and we believe that the ASF’s
>> > open and meritocratic philosophy will help us with this. We note the
>> > success of other, similar projects already part of the ASF.
>> >
>> > = Core Developers =
>> > Most - but not all - of Impala’s core developers are not currently
>> > affiliated with the ASF, and will require new ICLAs.
>> >
>> > = Alignment =
>> > Impala is related to several other Apache projects:
>> >
>> >  * Data that is read by Impala is very often stored in Apache Hadoop
>> > clusters powered by the HDFS filesystem.
>> >  * Impala can also read data stored in Apache HBase
>> >  * Metadata for databases, tables and so on is read by Impala from Apache
>> > Hive.
>> >  * The preferred data format for HDFS-based tables is Apache Parquet, and
>> > Apache Avro is also a supported data format.
>> >  * Impala is closely integrated with Kudu, which is also being proposed to
>> > the Incubator.
>> >  * Impala uses Apache Thrift as its RPC and serialization framework of
>> > choice.
>> >
>> > = Known Risks =
>> >
>> > == Orphaned Products ==
>> > Impala is used by most of Cloudera’s customers, and Cloudera remains
>> > committed to developing and supporting the project. Cloudera has a strong
>> > track record in standing behind projects that were contributed to the ASF
>> > by its employees, including Apache Flume, Apache Sqoop, and others. Other
>> > companies both ship and support Impala, lending credence to the idea that
>> > Impala is not at risk of being suddenly orphaned.
>> >
>> > == Inexperience with Open Source ==
>> > Although all committers on the initial list have significant experience
>> > with at least one open-source project - namely Impala - fewer have much
>> > experience with ASF-based software projects as contributors and community
>> > members. However, with the guidance of our mentors, committers who do have
>> > ASF experience, and time to learn during Incubation, we are confident that
>> > the project can be run in accordance with Apache principles on an ongoing
>> > basis.
>> >
>> > == Homogeneous Developers ==
>> >
>> > The initial committers are employees of Cloudera.
>> >
>> > The project has received some contributions from developers outside of
>> > Cloudera, from individuals belonging to organizations such as Intel and
>> > Google, from hobbyists and from students using Impala to advance their
>> > understanding of distributed databases. The project attracted an active
>> > user community as well. We hope to continue to encourage contributions from
>> > these developers and community members and grow them into committers after
>> > they have had time to continue their contributions.
>> >
>> > == Reliance on Salaried Developers ==
>> >
>> > Many of Impala’s initial set of committers work full-time on Impala, and
>> > are paid to do so. However, as mentioned elsewhere, we anticipate growth in
>> > the developer community which we hope will include hobbyists and academics
>> > who have an interested in distributed data systems.
>> >
>> > == An Excessive Fascination with the Apache Brand ==
>> > Although we hope that Impala benefits from the Apache Brand, any reflected
>> > goodwill to Cloudera as the contributing entity is not the goal of
>> > establishing Impala as an Apache project. We will work with the Incubator
>> > PMC and the PRC to ensure that the Apache Brand is respected.
>> >
>> > = Documentation =
>> > Impala: A Modern, Open-Source SQL Engine for Hadoop (
>> > http://www.cidrdb.org/cidr2015/Papers/CIDR15_Paper28.pdf)
>> >
>> > Impala’s developer wiki (https://github.com/cloudera/Impala/wiki)
>> >
>> > Impala’s auto-generated API documentation (
>> > http://impala.io/doc/html/index.html)
>> >
>> > = Initial Source =
>> > Impala’s initial source contribution will come from
>> > http://github.com/cloudera/Impala/.
>> >
>> > = External Dependencies =
>> >
>> > Impala depends upon a number of third-party libraries, which we list below.
>> > We intend to compile a LICENSE.txt file in the very short term (see
>> > https://issues.cloudera.org/browse/IMPALA-2670).
>> >
>> >  * Google gflags (BSD)
>> >  * Google glog (BSD)
>> >  * Apache Thrift (Apache Software License v2.0)
>> >  * Apache Commons (Apache Software License v2.0)
>> >  * Apache Hadoop (Apache Software License v2.0)
>> >  * Apache HBase (Apache Software License v2.0)
>> >  * Apache Hive (Apache Software License v2.0)
>> >  * Boost (Boost Software License)
>> >  * OpenLdap (OpenLDAP Software License)
>> >  * rapidjson (MIT)
>> >  * Google RE2 (BSD-style)
>> >  * lz4 (BSD)
>> >  * snappy (BSD)
>> >  * cyrus-sasl (CMU License)
>> >  * Apache Avro (Apache Software License v2.0)
>> >  * Cloudera squeasel (Apache Software License v2.0)
>> >  * Apache htrace (Incubating) (Apache Software License v2.0)
>> >  * Apache Sentry (Incubating) (Apache Software License v2.0)
>> >  * Apache Shiro (Apache Software License v2.0)
>> >  * Twitter Bootstrap (Apache Software License v2.0)
>> >  * d3 (BSD)
>> >  * LLVM (BSD-like)
>> >
>> > Build and test dependencies:
>> >
>> >  * ant (Apache Software License v2.0)
>> >  * Apache Maven (Apache Software License v2.0)
>> >  * cmake (BSD)
>> >  * clang (BSD)
>> >  * Google gtest (Apache Software License v2.0)
>> >
>> > = Required Resources =
>> >
>> > We request that following resources be created for the project to use:
>> >
>> > == Mailing lists ==
>> >
>> >  * private@impala.incubator.apache.org (moderated subscriptions)
>> >  * commits@impala.incubator.apache.org
>> >  * dev@impala.incubator.apache.org
>> >  * issues@impala.incubator.apache.org
>> >  * user@impala.incubator.apache.org
>> >
>> > == Git repository ==
>> > https://git.apache.org/impala.git
>> >
>> > == JIRA instance ==
>> > JIRA project IMPALA (IMPALA or IMP)
>> >
>> > == Other Resources ==
>> > We hope to continue using Gerrit for our code review and commit workflow.
>> > We are involved with discussions that the Kudu team at Cloudera have been
>> > having with Jake Farrell to start discussions on how Gerrit can fit into
>> > the ASF. We know that several other ASF projects or podlings are also
>> > interested in Gerrit.
>> >
>> > If the Infrastructure team does not have the bandwidth to support gerrit,
>> > we will continue to support our own instance of gerrit for Impala, and make
>> > the necessary integrations such that commits are properly authenticated and
>> > maintain sufficient provenance to uphold the ASF standards (e.g. via the
>> > solution adopted by the AsterixDB podling).
>> >
>> > = Initial Committers =
>> >
>> >  * Tim Armstrong
>> >  * Alex Behm
>> >  * Taras Bobrovytsky
>> >  * Casey Ching
>> >  * Martin Grund
>> >  * Daniel Hecht
>> >  * Michael Ho
>> >  * Matthew Jacobs
>> >  * Ishaan Joshi
>> >  * Lenni Kuff
>> >  * Marcel Kornacker
>> >  * Sailesh Mukil
>> >  * Henry Robinson
>> >  * John Russell
>> >  * Dimitris Tsirogiannis
>> >  * Skye Wanderman-Milne
>> >  * Juan Yu
>> >
>> > == Affiliations ==
>> > All: Cloudera Inc.
>> >
>> > = Sponsors =
>> >
>> > == Champion ==
>> > Tom White
>> >
>> > == Nominated Mentors ==
>> >  * Tom White (Cloudera)
>> >  * Todd Lipcon (Cloudera)
>> >  * Carl Steinbach (LinkedIn)
>> >  * Brock Noland (StreamSets)
>> >
>> >
>> > = Sponsoring Entity =
>> > We ask that the Incubator PMC sponsor this proposal.
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
For additional commands, e-mail: general-help@incubator.apache.org


Mime
View raw message