From: Chinmay Kolhatkar
Reply-To: general@incubator.apache.org
Subject: Re: [VOTE] Accept Druid into the Apache Incubator
Date: Mon, 26 Feb 2018 05:42:57 -0000

+1

On 2018/02/24 09:02:41, Ted Dunning wrote:
> +1
>
> On Thu, Feb 22, 2018 at 11:03 AM, Julian Hyde wrote:
>
> > Hi all,
> >
> > After some discussion on the Druid proposal[1], I'd like to
> > start a vote on accepting Druid into the Apache Incubator,
> > per the ASF policy[2] and voting rules[3].
> >
> > A vote for accepting a new Apache Incubator podling is a
> > majority vote for which only Incubator PMC member votes are
> > binding. Votes from other people are also welcome as an
> > indication of people's enthusiasm (or lack thereof).
> >
> > Please do not use this VOTE thread for discussions. If
> > needed, start a new thread instead.
> >
> > This vote will run for at least 72 hours.
> > Please VOTE as follows:
> > [ ] +1 Accept Druid into the Apache Incubator
> > [ ] +0 Abstain
> > [ ] -1 Do not accept Druid into the Apache Incubator
> > because ...
> >
> > The proposal is listed below, but you can also access it on
> > the wiki[4].
> >
> > Julian
> >
> > [1] https://lists.apache.org/thread.html/b95f90a30b6e8587e9b108f368b07c1b3e23e25ca592448d9c9f81e2@%3Cgeneral.incubator.apache.org%3E
> >
> > [2] https://incubator.apache.org/policy/incubation.html#approval_of_proposal_by_sponsor
> >
> > [3] http://www.apache.org/foundation/voting.html
> >
> > [4] https://wiki.apache.org/incubator/DruidProposal
> >
> >
> > = Druid Proposal =
> >
> > == Abstract ==
> >
> > Druid is a high-performance, column-oriented, distributed
> > data store.
> >
> > == Proposal ==
> >
> > Druid is an open source data store designed for real-time
> > exploratory analytics on large data sets. Druid's key
> > features are a column-oriented storage layout, a distributed
> > shared-nothing architecture, and the ability to generate and
> > leverage indexing and caching structures. Druid is typically
> > deployed in clusters of tens to hundreds of nodes, and has
> > the ability to load data from Apache Kafka and Apache
> > Hadoop, among other data sources. Druid offers two query
> > languages: a SQL dialect (powered by Apache Calcite) and a
> > JSON-over-HTTP API.
> >
> > Druid was originally developed to power a slice-and-dice
> > analytical UI built on top of large event streams. The
> > original use case for Druid targeted ingest rates of
> > millions of records/sec, retention of over a year of data,
> > and query latencies of sub-second to a few seconds. Many
> > people can benefit from such capability, and many already
> > have (see http://druid.io/druid-powered.html).
> > In addition, new use cases have emerged since Druid's original
> > development, such as OLAP acceleration of data warehouse
> > tables and more highly concurrent applications operating
> > with relatively narrow queries.
> >
> > == Background ==
> >
> > Druid is a data store designed for fast analytics. It would
> > typically be used in lieu of more general-purpose query
> > systems like Hadoop MapReduce or Spark when query latency is
> > of the utmost importance. Druid is often used as a data
> > store for powering GUI analytical applications.
> >
> > The buzzwordy description of Druid is a high-performance,
> > column-oriented, distributed data store. What we mean by
> > this is:
> >
> > * "high performance": Druid aims to provide low query
> >   latency and high ingest rates.
> > * "column-oriented": Druid stores data in a column-oriented
> >   format, like most other systems designed for analytics. It
> >   can also store indexes along with the columns.
> > * "distributed": Druid is deployed in clusters, typically of
> >   tens to hundreds of nodes.
> > * "data store": Druid loads your data and stores a copy of
> >   it on the cluster's local disks (and may cache it in
> >   memory). It doesn't query your data from some other
> >   storage system.
> >
> > == Rationale ==
> >
> > Druid is a mature, active project with a large number of
> > production installations, dozens of contributors to each
> > release, and multiple vendors offering professional
> > support. Given Druid's strong community, its close
> > integration with many other Apache projects (such as Kafka,
> > Hadoop, and Calcite), and its pre-existing Apache-inspired
> > governance structure, we feel that Apache is the best home
> > for the project on a long-term basis.
> >
> > == Current Status ==
> >
> > === Meritocracy ===
> >
> > Since Druid was first open sourced, the original developers
> > have solicited contributions from others, including through
> > our blog, the project mailing lists, and through accepting
> > GitHub pull requests. We have an Apache-inspired governance
> > structure with a PMC and committers, and our committer ranks
> > include a good number of people from outside the original
> > development team.
> >
> > === Community ===
> >
> > The Druid core developers have sought to nurture a community
> > throughout the life of the project. We use GitHub as the
> > focal point for bug reports and code contributions, and the
> > mailing lists for most other discussion. To try to make
> > people feel welcome, we've also spelled this out on a
> > "CONTRIBUTE" link from the project page:
> > http://druid.io/community/. Today we have an active
> > contributor base (a typical release has ~40 contributors)
> > and mailing list.
> >
> > === Core Developers ===
> >
> > Druid enjoys good diversity of committer affiliation. The
> > most active developers over the past year are affiliated
> > with four different companies: Imply, Metamarkets, Yahoo,
> > and Hortonworks. Many Druid committers are also committers
> > on other ASF projects, including Apache Airflow,
> > Apache Curator, and Apache Calcite. The original developers
> > of Druid remain involved in the project.
> >
> > === Alignment ===
> >
> > Druid's current governance structure is Apache-inspired with
> > a PMC and committers chosen by a meritocratic
> > process. Additionally, Druid integrates with a number of
> > other Apache projects, including Kafka, Hadoop, Hive,
> > Calcite, Superset (incubating), Spark, Curator, and
> > ZooKeeper.
> >
> > == Known Risks ==
> >
> > === Orphaned products ===
> >
> > The risk of Druid becoming orphaned is low, due to a diverse
> > committer base that is invested in the future of the
> > project.
> >
> > === Inexperience with Open Source ===
> >
> > Druid's core developers have been running it as a
> > community-oriented open source project for some time now,
> > and many of them are committers on other open source
> > projects as well, including Apache Airflow, Apache Curator,
> > and Apache Calcite.
> >
> > === Homogeneous Developers ===
> >
> > Druid's current diversity of committer affiliation means
> > that we have become accustomed to working collaboratively
> > and in the open. We hope that a transition to the ASF helps
> > Druid's contributor base become even more diverse.
> >
> > === Reliance on Salaried Developers ===
> >
> > Druid's user base and contributor base skew heavily towards
> > salaried developers. We believe this is natural since Druid
> > is a technology designed to be deployed on large clusters,
> > and due to this, tends to be deployed by organizations
> > rather than by individuals. Nevertheless, many current Druid
> > developers have continued working on the project even
> > through job changes, which we take to be a good sign of
> > developer commitment and personal interest.
> >
> > === Relationships with Other Apache Products ===
> >
> > Druid integrates with a number of other Apache
> > projects. Druid internally uses Calcite for SQL planning,
> > and Curator and ZooKeeper for coordination. Druid can read
> > data in Avro or Parquet format. Druid can load data from
> > streams in Kafka or from files in Hadoop. Druid integrates
> > with Hive as an option for SQL query acceleration. Druid
> > data can be visualized by Superset (incubating).
> >
> > === An Excessive Fascination with the Apache Brand ===
> >
> > Druid is a successful project with a diverse community. The
> > main reason for pursuing incubation is to find a stable,
> > long-term home for the project with a well-known governance
> > philosophy.
> >
> > == Required Resources ==
> >
> > === Mailing lists ===
> >
> > We would like to migrate the existing Druid mailing lists
> > from Google Groups to Apache.
> >
> > * druid-user@googlegroups -> users@druid.incubator.apache.org
> > * druid-development@googlegroups -> dev@druid.incubator.apache.org
> >
> > === Source control ===
> >
> > Druid development currently takes place on GitHub. We would
> > like to continue using GitHub, if possible, in order to
> > preserve the workflows the community has developed around
> > GitHub pull requests.
> >
> > === Issue tracking ===
> >
> > Druid currently uses GitHub issues for issue tracking. We
> > would like to migrate to Apache JIRA at
> > http://issues.apache.org/jira/browse/DRUID.
> >
> > == Documentation ==
> >
> > Druid's documentation can be found at
> > http://druid.io/docs/latest/.
> >
> > == Initial Source ==
> >
> > Druid was initially open-sourced by Metamarkets in 2012 and
> > has been run in a community-governed fashion since then. The
> > code is currently hosted at https://github.com/druid-io/ and
> > includes the following repositories:
> >
> > * druid (primary repository)
> > * druid-console (web console for Druid)
> > * druid-io.github.io (source for Druid's website at
> >   http://druid.io/)
> > * tranquility (realtime stream push client for Druid)
> > * docker-druid (Docker image for Druid)
> > * pydruid (Python library)
> > * RDruid (R library)
> > * oss-parent (Maven POM files)
> >
> > == Source and Intellectual Property Submission Plan ==
> >
> > A complete set of the open source code needs to be licensed
> > from the owning organization to the Foundation. Commercial
> > legal counsel for the owning organization will review the
> > standard Foundation licensing paperwork and propose any
> > updates as needed. This license will enable Apache to
> > incubate and manage the Druid project moving forward.
> >
> > Other Druid paraphernalia to be transferred to Apache
> > consists of:
> >
> > * GitHub organization at https://github.com/druid-io/
> > * Twitter account at https://twitter.com/druidio
> > * "druid.io" domain name
> > * "Druid" trademark assignment per Foundation standard
> >   paper. The trademark assignment paperwork shall be
> >   reviewed by the owning organization's commercial and IP
> >   counsel
> > * CLAs - all rights in the code licensed above should
> >   encompass the CLAs that existed between developers and
> >   owning organization
> >
> > A copyright license to the code, trademark assignment of
> > Druid, and transfer of other paraphernalia to Apache should
> > be sufficient to cover all rights required by Apache to
> > operate the project.
> >
> > == External Dependencies ==
> >
> > External dependencies distributed with Druid currently all
> > have one of the following Category A or B licenses: ASL,
> > BSD, CDDL, EPL, MIT, MPL; with one exception: the optional
> > Druid MySQL metadata store extension depends on MySQL
> > Connector/J, which is GPL licensed. Druid currently packages
> > this as a separate download; see our current presentation
> > on: http://druid.io/downloads.html. As part of incubation we
> > intend to determine the best strategy for handling the MySQL
> > extension.
> >
> > == Cryptography ==
> >
> > Not applicable.
> >
> > == Initial Committers ==
> >
> > The initial committers for incubation are the current set of
> > committers on Druid who have expressed interest in being
> > involved in Apache incubation. Affiliations are listed
> > where relevant. We may seek to add other committers during
> > incubation; for example, we would want to add any current
> > Druid committers who express an interest after incubation
> > begins.
> >
> > * Charles Allen (charles@allen-net.com) (Snap)
> > * David Lim (david.clarence.lim@gmail.com) (Imply)
> > * Eric Tschetter (cheddar@apache.org) (Splunk)
> > * Fangjin Yang (fj@imply.io) (Imply)
> > * Gian Merlino (gian@apache.org) (Imply)
> > * Himanshu Gupta (g.himanshu@gmail.com) (Oath)
> > * Jihoon Son (jihoonson@apache.org) (Imply)
> > * Jonathan Wei (jon.wei@imply.io) (Imply)
> > * Maxime Beauchemin (maximebeauchemin@gmail.com) (Lyft)
> > * Mohamed Slim Bouguerra (slim.bouguerra@gmail.com) (Hortonworks)
> > * Nishant Bangarwa (nishant@apache.org) (Hortonworks)
> > * Parag Jain (paragjain16@gmail.com) (Oath)
> > * Roman Leventov (leventov.ru@gmail.com) (Metamarkets)
> > * Xavier Léauté (xavier@leaute.com) (Confluent)
> >
> > == Sponsors ==
> >
> > * Champion: Julian Hyde
> > * Nominated mentors: Julian Hyde, P. Taylor Goetz, Jun Rao
> > * Sponsoring entity: Apache Incubator
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
> > For additional commands, e-mail: general-help@incubator.apache.org