incubator-general mailing list archives

From Patrick Reilly <prei...@php.net>
Subject Re: [VOTE] Phoenix for incubator project
Date Fri, 06 Dec 2013 00:24:47 GMT
+1 (non binding)

— Patrick

On Thu, Dec 5, 2013 at 3:52 PM, Ted Dunning <ted.dunning@gmail.com> wrote:
> +1 (binding)
>
>
>
> On Thu, Dec 5, 2013 at 3:42 PM, Anil Gupta <anilgupta84@gmail.com> wrote:
>
>> +1
>>
>> Hope to see Phoenix in Apache family.
>>
>> Sent from my iPhone
>>
>> > On Dec 5, 2013, at 1:43 PM, Stack <stack@duboce.net> wrote:
>> >
>> > Discussion of the Phoenix proposal has settled since its original
>> > posting on November 7th.  Feedback has been incorporated.
>> >
>> > Let us now move to a vote.
>> >
>> > Should Phoenix become an Apache incubator project?
>> >
>> > [] +1 Accept Phoenix into the Incubator
>> > [] +0 Don't care whether or which
>> > [] -1 Do not accept Phoenix into the Incubator because...
>> >
>> > The latest version of the proposal can be found here [1].  It is
>> > also posted below for your convenience.
>> >
>> > Let the vote run 72 hours.
>> >
>> > Thank you,
>> > St.Ack
>> >
>> > 1. https://wiki.apache.org/incubator/PhoenixProposal
>> >
>> >
>> >
>> >
>> > Abstract
>> >
>> > Phoenix is an open source SQL query engine for Apache HBase, a NoSQL data
>> > store. It is accessed as a JDBC driver and enables querying and managing
>> > HBase tables using SQL.
>> >
>> > Proposal
>> >
>> > Phoenix is an open source SQL skin over HBase delivered as a
>> > client-embedded JDBC driver targeting low latency queries over HBase
>> > data. Phoenix takes your SQL query, compiles it into a series of HBase
>> > scans, and orchestrates the running of those scans to produce regular
>> > JDBC result sets. The table metadata is stored in an HBase table and
>> > versioned, such that snapshot queries over prior versions will
>> > automatically use the correct schema. Direct use of the HBase API, along
>> > with coprocessors and custom filters, results in performance on the
>> > order of milliseconds for small queries, or seconds for tens of millions
>> > of rows. Phoenix interfaces with both Pig and Map-reduce for the input
>> > and output of data.
>> >
>> > Background
>> >
>> > Phoenix initially started as an internal project at Salesforce.com to
>> > efficiently analyze big data stored in HBase. It was open sourced on
>> > Github about a year ago in Jan 2013. Over time Phoenix, together with
>> > HBase as the storage tier, has begun to evolve into a general SQL
>> > database with support for metadata management, secondary indexes, joins,
>> > query optimization, and multi-tenancy. This is expected to continue as
>> > Phoenix implements a cost-based query optimizer and potentially
>> > transaction support, and surfaces new HBase security features such as
>> > encryption and cell-level security. Phoenix's developer community has
>> > also grown to include additional companies such as Intel, who have
>> > contributed join support to Phoenix, as well as Hortonworks, who are in
>> > the process of porting Phoenix to the 0.96 release of HBase.
>> >
>> > Rationale
>> >
>> > As usage of Phoenix and the number of its contributors have grown, we
>> > have sought a long-term home for the project, and we believe the Apache
>> > foundation would be a great fit. Joining Apache would ensure that tried
>> > and true processes and procedures are in place for the growing number of
>> > organizations interested in contributing to Phoenix. Phoenix is also a
>> > good fit for the Apache foundation: Phoenix already interoperates with
>> > several existing Apache projects (HBase, Hadoop, Pig, BigTop). The
>> > Phoenix team is familiar with the Apache process and believes in the
>> > Apache mission - the team already includes multiple Apache committers.
>> >
>> > Initial Goals
>> >
>> > The initial goals will be to move the existing codebase to Apache and
>> > integrate with the Apache development process. Once this is accomplished,
>> > we plan for incremental development and releases that follow the Apache
>> > guidelines.
>> >
>> > Current Status
>> >
>> > Phoenix has undergone two major and three minor releases (1.0, 1.1, 1.2,
>> > 2.0, and 2.1) as well as many patch releases. Phoenix is being used in
>> > production by Salesforce.com as well as at other organizations. The
>> > Phoenix codebase is currently hosted at github.com, which will form the
>> > basis of the Apache git repository.
>> >
>> > Meritocracy
>> >
>> > The Phoenix project already operates on meritocratic principles. Phoenix
>> > has several developers from various organizations outside of
>> > Salesforce.com who have contributed major new features. While this
>> > process has remained mostly informal, as we do not have an official
>> > committer list, an implicit organization exists in which individuals who
>> > contribute major components act as maintainers for those modules. If
>> > accepted, the Phoenix project would include several of these
>> > participants as initial committers. We will work to identify all
>> > committers and PPMC members for the project and to operate under the ASF
>> > meritocratic principles.
>> >
>> > Community
>> >
>> > Acceptance into the Apache foundation would bolster the already strong
>> > user and developer community around Phoenix. That community includes
>> > many contributors from various other companies, and an active mailing
>> > list composed of hundreds of users.
>> >
>> > Core Developers
>> >
>> > The core developers of our project are listed in our contributors and
>> > initial PPMC below. Though many are employed at Salesforce.com, there is
>> > a representative cross sampling of other organizations including Intel,
>> > Hortonworks, and Cloudera.
>> >
>> > Alignment
>> >
>> > Our proposed Phoenix effort aligns closely with Apache HBase. The HBase
>> > project perimeter is denoted by simple byte-array based Create, Read,
>> > Update, Delete and Scan APIs, with no current plans to extend beyond
>> > these bounds. Phoenix complements this with a higher level API in SQL,
>> > with which many are already familiar. At first glance, it may seem that
>> > Phoenix should just be folded into HBase as a new module. However, the
>> > focus of the two projects will be quite different, especially as Phoenix
>> > matures. With secondary indexing and joins just having been introduced
>> > into Phoenix, the next big frontier will be to implement a cost-based
>> > query optimizer. This is the heart-and-soul of most relational databases
>> > and can take a lifetime to get right.
>> >
>> > HBase is focused on being a scalable data store agnostic to types and
>> > schema. Phoenix would layer typing and relational facilities on top of
>> > this scalable store. By keeping Apache HBase and Phoenix separate, both
>> > may evolve independently and at different rates. Though the focus of the
>> > two projects is different, the relationship between them is very
>> > positive and mutually beneficial. New features in HBase will be
>> > leveraged in Phoenix as it makes sense to surface these in a SQL
>> > paradigm. In addition, Phoenix may drive new features in HBase, as
>> > evidenced by the new type system recently introduced into HBase. This
>> > will enable better interoperability between Apache Hive, standalone
>> > HBase use cases, and Phoenix by defining a standard serialization
>> > format.
>> >
>> > Phoenix can be divided into a front end and a back end. The front end is
>> > delivered as a JDBC driver and contains, among other things, the SQL
>> > parser and query planner. The front end is currently written for the
>> > HBase client API but could be extended to support other data stores in
>> > the Apache family.
>> >
>> > The back end currently consists of HBase-specific components for pushing
>> > as much work to the server as possible. However, if there were
>> > sufficient interest to build them, contributions to Phoenix of new back
>> > ends for other data stores in the Apache family would be feasible.
>> >
>> > Other projects exist that perform SQL over HBase data (such as Apache
>> > Hive); however, these products do not provide the same low latency query
>> > capabilities as Phoenix. Instead, they are more oriented around
>> > maximizing throughput for batched operations. Phoenix opens the door to
>> > a completely new set of use cases for Apache HBase that demand a more
>> > interactive user experience.
>> >
>> > There are also a number of related Apache projects and dependencies that
>> > are mentioned in the Relationship with Other Apache Products section.
>> >
>> > Known Risks
>> >
>> > Orphaned Products
>> >
>> > Given the current level of investment in Phoenix, the risk of the
>> > project being abandoned is minimal. All current and planned HBase use
>> > cases at Salesforce.com go through Phoenix. In addition, both Intel and
>> > Hortonworks plan to include Phoenix in their distributions. Other
>> > companies have devoted significant internal infrastructure investment to
>> > Phoenix.
>> >
>> > Inexperience with Open Source
>> >
>> > Phoenix has existed as a healthy open source project for almost a year.
>> > During that time, James, Mujtaba, and others have successfully fostered
>> > an open-source community, attracting users and developers from a diverse
>> > group of companies including Intel, Intuit, Bloomberg, Tagged, and
>> > Hortonworks. Although neither is a committer on other Apache projects,
>> > both James and Mujtaba have experience working with and contributing to
>> > other Apache projects.
>> >
>> > Homogenous Developers
>> >
>> > The initial list of committers includes developers from several
>> > institutions, including Salesforce, Intel, and Hortonworks.
>> >
>> > Reliance on Salaried Developers
>> >
>> > Like most open source projects, Phoenix receives substantial support from
>> > salaried developers. A large fraction of Phoenix development is supported
>> > by Salesforce.com. In addition, those working from within corporations
>> > and universities often devote “after hours” or spare time to the
>> > project. We will continue our efforts to ensure that stewardship of the
>> > project is independent of salaried developers.
>> >
>> > Relationship with Other Apache Products
>> >
>> > Although Phoenix provides a higher level abstraction than Apache HBase by
>> > hiding its client APIs, Phoenix relies on Apache HBase for both storing
>> > and retrieving data. It also inter-operates with Apache HBase by
>> > allowing existing data, not created by Phoenix, to be queried. In
>> > addition, both Apache Pig and Hadoop are supported for data input and
>> > output. Finally, Phoenix is included in and installable through Apache
>> > Bigtop, and the build and test suite are run through Apache Maven.
>> >
>> > Phoenix offers an alternative query engine to Apache Hadoop (MapReduce).
>> > Unlike MapReduce, Phoenix is designed for lower-latency, OLTP, and
>> > interactive workloads. This makes the projects complementary, as users
>> > may run MapReduce and Phoenix side-by-side.
>> >
>> > We plan to increase the interoperability between Phoenix, Apache Hive,
>> > and standalone Apache HBase usage by standardizing on a new type system
>> > that has been introduced in the current major release of HBase. Once all
>> > of these products adopt this new serialization format, interoperability
>> > between them will take a big step forward.
>> >
>> > In addition, we plan to explore providing lower level APIs for other
>> > products such as Apache Drill to plug into when querying HBase data so
>> > that they get the performance benefits of using Phoenix.
>> >
>> > An Excessive Fascination with the Apache Brand
>> >
>> > Phoenix is already a healthy and relatively well known open source
>> > project. This proposal is not for the purpose of generating publicity.
>> > Rather, the primary benefits to joining Apache are those outlined in the
>> > Rationale section.
>> >
>> > Documentation
>> >
>> > Additional documentation on Phoenix may be found on its github website:
>> >
>> > Phoenix overview:
>> > https://github.com/forcedotcom/phoenix/blob/master/README.md
>> >
>> > Phoenix wiki: https://github.com/forcedotcom/phoenix/wiki
>> >
>> > Phoenix road map: https://github.com/forcedotcom/phoenix/wiki#roadmap
>> >
>> > Phoenix issue tracking:
>> >
>> > https://github.com/forcedotcom/phoenix/issues?direction=desc&sort=updated&state=open
>> >
>> > Phoenix codebase: https://github.com/forcedotcom/phoenix
>> >
>> > Phoenix SQL language reference: http://forcedotcom.github.io/phoenix/
>> >
>> > Phoenix performance:
>> >
>> > https://github.com/forcedotcom/phoenix/wiki/Performance#phoenix-vs-related-products
>> >
>> > User group: https://groups.google.com/group/phoenix-hbase-user
>> >
>> > Initial Source
>> >
>> > The Phoenix codebase is currently hosted on Github:
>> > https://github.com/forcedotcom/phoenix.
>> >
>> > Source and Intellectual Property Submission Plan
>> >
>> > Currently, the Phoenix codebase is distributed under a BSD license. Upon
>> > entering Apache, the Phoenix license will be migrated to the Apache 2.0
>> > License.
>> >
>> > External Dependencies
>> >
>> > Beyond relying on Apache HBase, Phoenix has the following external
>> > dependencies:
>> >
>> > ANTLR 3.5 (BSD license: http://www.antlr3.org/license.html)
>> >
>> > Sqlline 1.1.2 (BSD license:
>> > https://github.com/julianhyde/sqlline/blob/master/LICENSE)
>> >
>> > Open CSV 2.3 (Apache 2.0 license)
>> >
>> > Upon acceptance to the incubator, we would begin a thorough analysis of
>> > all transitive dependencies to verify this information and introduce
>> > license checking into the build and release process by integrating with
>> > Apache Rat.
>> >
>> > Required Resources
>> >
>> > Mailing list
>> >
>> > We will migrate the existing Phoenix mailing lists as follows:
>> >
>> > phoenix-hbase-user@googlegroups.com --> users@phoenix.incubator.apache.org
>> >
>> > phoenix-hbase-dev@googlegroups.com --> dev@phoenix.incubator.apache.org
>> >
>> > private@phoenix.incubator.apache.org for IPMC members
>> >
>> > commits@phoenix.incubator.apache.org
>> >
>> > The latter is to be consistent with the new PIAO naming scheme for
>> > podlings.
>> >
>> > Source control
>> >
>> > The Phoenix team would like to use Git for source control, due to our
>> > current use of Git. We request a writeable Git repo for Phoenix, and
>> > mirroring to be set up to Github through INFRA.
>> >
>> > Issue Tracking
>> >
>> > Phoenix currently uses the github issue tracking system associated with
>> > its github repo:
>> > https://github.com/forcedotcom/phoenix/issues?direction=desc&sort=updated&state=open
>> > We will migrate to the Apache JIRA:
>> > http://issues.apache.org/jira/browse/PHOENIX
>> >
>> > Other Resources
>> >
>> > Jenkins/Hudson for builds and test running.
>> > Wiki for documentation purposes
>> > Blog to improve project dissemination
>> >
>> > Initial Committers
>> >
>> > James Taylor <jtaylor at salesforce dot com>
>> >
>> > Mujtaba Chohan <mchohan at salesforce dot com>
>> >
>> > Jesse Yates <jyates at apache dot org>
>> >
>> > Eli Levine <elevine at salesforce dot com>
>> >
>> > Simon Toens <stoens at salesforce dot com>
>> >
>> > Maryann Xue <wei.xue at intel dot com>
>> >
>> > Anoop Sam John <anoopsamjohn at apache dot org>
>> >
>> > Ramkrishna S Vasudevan <ramkrishna at apache dot org>
>> >
>> > Jeffrey Zhong <jeffreyz at apache dot org>
>> >
>> > Nick Dimiduk <ndimiduk at apache dot org>
>> >
>> > Affiliations
>> >
>> > The initial committers are from three organizations: Salesforce.com,
>> > Intel, and Hortonworks.
>> >
>> > James Taylor (Salesforce.com)
>> > Mujtaba Chohan (Salesforce.com)
>> > Jesse Yates (Salesforce.com)
>> > Eli Levine (Salesforce.com)
>> > Simon Toens (Salesforce.com)
>> > Maryann Xue (Intel)
>> > Anoop Sam John (Intel)
>> > Ramkrishna S Vasudevan (Intel)
>> > Jeffrey Zhong (Hortonworks)
>> > Nick Dimiduk (Hortonworks)
>> >
>> > Sponsors
>> >
>> > Champion
>> >
>> > Michael Stack
>> >
>> > Nominated Mentors
>> >
>> > Michael Stack
>> > Lars Hofhansl
>> > Andrew Purtell
>> > Devaraj Das
>> > Enis Soztutar
>> > Steven Noels
>> >
>> > Sponsoring Entity
>> >
>> > The Apache Incubator
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
>> For additional commands, e-mail: general-help@incubator.apache.org
>>
>>


