incubator-general mailing list archives

From Ryan Blue <b...@cloudera.com>
Subject Re: [VOTE] Accept Trafodion into Apache Incubator
Date Wed, 20 May 2015 16:47:53 GMT
+1 (non-binding)

On 05/20/2015 09:04 AM, Devaraj Das wrote:
> +1 (binding)
>
>
>> On May 19, 2015, at 2:28 PM, Stack <stack@duboce.net> wrote:
>>
>> Following the discussion earlier in the thread [1], I would like to call a
>> VOTE to accept Trafodion as a new Apache Incubator project.
>>
>> The proposal is available on the wiki at [2] and is also attached to this
>> mail.
>>
>> The VOTE is open for at least the next 72 hours:
>>
>> [ ] +1 accept Trafodion into the Apache Incubator
>> [ ] ±0 Abstain
>> [ ] -1 because...
>>
>> I am +1 (binding)
>>
>> Thank you,
>> St.Ack
>>
>> 1.
>> http://mail-archives.apache.org/mod_mbox/incubator-general/201505.mbox/%3CCADcMMgG4NHtmFZ519iqgZLA8Lj-E7VmaQ%3Dr8C011LuS5pR0Vkw%40mail.gmail.com%3E
>> 2.  https://wiki.apache.org/incubator/TrafodionProposal
>> <https://wiki.apache.org/incubator/TrafodionProposal#preview>
>>
>>
>>
>> Trafodion Apache Incubator Proposal
>>
>> Abstract
>>
>> Trafodion is a webscale SQL-on-Hadoop solution enabling transactional or
>> operational workloads on Hadoop.
>>
>> Proposal
>>
>> Apache Trafodion builds on the scalability, elasticity, and flexibility of
>> Hadoop. Trafodion extends Hadoop to provide guaranteed transactional
>> integrity, enabling new kinds of big data applications to run on Hadoop. Key
>> features of Apache Trafodion include:
>>
>> * Full-functioned ANSI SQL language support
>> * JDBC/ODBC connectivity for Linux/Windows clients
>> * Distributed ACID transaction protection across multiple statements,
>> tables and rows
>> * Performance improvements for OLTP workloads with compile-time and
>> run-time optimizations
>> * Support for large data sets using a parallel-aware query optimizer
>> * ANSI SQL security and data integrity constraints including referential
>> integrity
>>
>> Hewlett-Packard Company submits this proposal to donate its Apache License,
>> Version 2.0 open source project known as Trafodion, its source code,
>> documentation, and web site content to the Apache Software Foundation in
>> order to build an open source community.
>>
>> Background
>>
>> Trafodion is an open source project sponsored by HP, incubated at HP Labs
>> and HP-IT, to develop an enterprise-class SQL-on-Hadoop solution targeting
>> big data transactional or operational workloads. HP publicly announced
>> the open source project and uploaded the source code to GitHub in June 2014.
>>
>> The SQL compiler, optimizer and executor components of Trafodion have a
>> rich heritage. Under development since 1993, they were released as
>> commercial closed source software in various flavors such as HP NonStop
>> SQL/MX and HP Neoview. NonStop SQL/MX was designed for online transaction
>> processing on HP’s NonStop (formerly Tandem) fault-tolerant servers and is
>> known for its high availability, scalability, and performance. Hundreds of
>> companies and thousands of servers are running mission-critical
>> applications today on NonStop SQL/MX. In addition, many of these components
>> run today inside HP as the core of its Enterprise Data Warehouse (EDW),
>> managing over a petabyte of data.
>>
>> Starting in 2013, the software was modified to run on HBase and a new
>> distributed transaction manager was written to run as an HBase co-processor.
>>
>> Unlike most NoSQL and other SQL-on-Hadoop open source projects, Trafodion
>> provides comprehensive ANSI SQL language support including full-functioned
>> data definition (DDL), data manipulation (DML), transaction control (TCL)
>> and database utility support.
>>
>> Trafodion provides comprehensive and standard SQL data manipulation support
>> including SELECT, INSERT, UPDATE, DELETE, and UPSERT/MERGE syntax with
>> language options including join variants, unions, where predicates,
>> aggregations (group by and having), sort ordering, sampling, correlated and
>> nested sub-queries, cursors, and many SQL functions.
>>
>> Utilities are provided for updating the table statistics (i.e. selectivity
>> and cardinality estimates) that the optimizer uses to cost plan
>> alternatives, for displaying the chosen SQL execution plan, for plan
>> shaping, for backing up and restoring the database, for loading and
>> unloading data, and for interfacing with the database engine from the
>> command line.
>>
>> Explicit control statements are provided to allow applications to define
>> transaction boundaries and to abort transactions when warranted, including
>> BEGIN WORK, COMMIT WORK, ROLLBACK WORK and SET TRANSACTION.
>>
>> Trafodion supports ANSI’s grant/revoke semantics to define user and role
>> privileges in terms of managing and accessing the database objects.
>>
>> Rationale
>>
>> The name “Trafodion” (the Welsh word for transactions, pronounced
>> “Tra-vod-eee-on”) was chosen specifically to emphasize the differentiation
>> that Trafodion provides in closing a critical gap in the Hadoop ecosystem.
>> Trafodion builds on the scalability, elasticity, and flexibility of Hadoop.
>> Trafodion extends Hadoop to provide guaranteed transactional integrity,
>> enabling new kinds of big data applications to run on Hadoop.
>>
>> Current Status
>>
>> HP released the Trafodion code under the Apache License, Version 2, in June
>> of 2014. Since that time, we have had one major release in January 2015 and
>> one minor release in April 2015. The focus of these releases has been on
>> getting our base functionality, including security, working on top of
>> Apache HBase, as well as improving performance, availability and
>> scalability, and integrating better with HBase.
>>
>> Meritocracy
>>
>> We want to build a diverse developer community, based on the Apache Way,
>> around Trafodion. To help developers become contributors, we have
>> documentation on the wiki about the architecture, the source tree
>> structure, and an example enhancement. We plan to publish our project
>> backlog to the community, specifically highlighting areas where developers
>> new to Trafodion may best start contributing, such as extending the
>> database functionality with User Defined Routines (UDRs) and integrating
>> with other Apache projects in the Hadoop ecosystem.
>>
>> Community
>>
>> We have already begun building a community but at this time the community
>> consists only of Trafodion developers – all HP employees – and prospective
>> users. We have participated in and hosted HBase Meetups and intend to ramp
>> up our community building efforts.
>>
>> The Trafodion project has seen interest in China, where HP has conducted
>> proof-of-concepts with multiple companies and expects to see some of its
>> first commercial deployments. To help recruit contributors and users in
>> China, members of the team are translating Trafodion wiki content into
>> Mandarin.
>>
>> Core Developers
>>
>> The core developers are very experienced in database and transaction
>> monitor technology, with many having spent more than 20 years working in
>> this space.
>>
>> Alignment
>>
>> Apache Trafodion relies on Apache HBase as its storage engine. The
>> development team has collaborated with and gained valuable advice from
>> working with the Apache HBase core developers. Apache Trafodion has
>> federation capabilities as well, and can query Trafodion tables stored in
>> HBase, native HBase tables, and Apache Hive tables.
>>
>> Known Risks
>>
>> Orphaned Products
>>
>> HP Labs and HP-IT have been incubating Trafodion development for almost two
>> years. This is part of HP’s strategy to leverage its investment in database
>> software and bring software to market as open source and is similar to HP’s
>> efforts with OpenStack. Trafodion builds on HP’s equity investment in the
>> Hadoop ecosystem and its efforts to monetize Hadoop through hardware,
>> software, and services. HP wants Trafodion to be successful, as HP will
>> offer a commercially supported distribution of Trafodion.
>>
>> Inexperience with Open Source
>>
>> We have been working with open source software in building closed source
>> software for well over two decades. To help transition to doing open source
>> development, the development team received guidance and best practices from
>> HP developers working on OpenStack open source projects, many of whom have
>> experience working on Apache and other open source projects as well. Since
>> releasing Trafodion as an open source project in June of 2014, the
>> committers and contributors have moved forward using open source
>> development processes and tools for bug tracking and design blueprints and
>> Jenkins for continuous integration. As part of the incubation process, we
>> recognize we may need to change some of our development processes/tools and
>> conduct our discussions using Apache email dlists.
>>
>> Homogeneous Developers
>>
>> Since the initial development of Trafodion has been supported by HP, all of
>> the current developers are HP employees. Through the support of the Apache
>> incubation project, we aim to expand the list of developers and gain
>> contributors from related SQL-on-Hadoop projects and the Apache HBase
>> project. Trafodion developers are experienced with distributed development
>> processes, being primarily based in Palo Alto, CA; Austin, TX; and
>> Shanghai, China. Trafodion is written in C++ and Java.
>>
>> Reliance on Salaried Developers
>>
>> Currently all of the developers working on the project are paid by their
>> employer to work on the project. These developers will work on the open
>> source project as well as work on the commercially supported distribution
>> of Trafodion that HP will offer.
>>
>> Relationship with Other Apache Products
>>
>> Trafodion is built upon Apache HBase and extends it to support ACID
>> transactions with HBase co-processors for distributed transaction
>> management and recovery. Trafodion envisions future collaborations with the
>> Apache HBase project on performance optimizations, such as in the areas of
>> mixed workload support, High Availability, etc. Trafodion also provides
>> transactional support for, and querying of, native HBase tables.
>>
>> Trafodion uses Apache ZooKeeper to coordinate and manage the distribution
>> of connection services across the cluster for load-balancing and high
>> availability reconnection purposes in the event a Trafodion process should
>> fail.
>>
>> Trafodion also envisions working with the Apache Ambari project on enabling
>> better Trafodion manageability. While Ambari focuses on system and
>> component level performance metrics, Trafodion manageability will focus in
>> a complementary way on database workload monitoring and performance
>> analytics with capabilities more geared towards database administrators.
>>
>> There are alternative open source projects that are providing SQL-on-Hadoop
>> capabilities, such as Apache Hive, Apache Drill, and Apache Phoenix. These
>> are more focused on reporting and analytics across data structures
>> supported on HDFS. In comparison to all of these technologies, Trafodion
>> provides a very complete implementation of ANSI SQL, one of the most
>> sophisticated optimizers for such workloads, a completely parallel data
>> flow architecture that does not materialize intermediate results unless
>> necessary, full ACID transactional support, ANSI GRANT/REVOKE security, and
>> other capabilities that would take decades to build in these products. On
>> the other hand, Trafodion currently focuses only on HBase and on querying
>> Hive, whereas Hive and Drill provide access to other data formats in HDFS.
>>
>> An Excessive Fascination with the Apache Brand
>>
>> We understand the reputation and value of the Apache brand, and have no
>> doubt that it will help us attract contributors and users. Our primary
>> goal is to follow a proven, open source development and community building
>> model that will make Trafodion successful and enable better collaboration
>> with other Apache projects in the Hadoop ecosystem. We also understand the
>> rules and guidelines about the use of the Apache brand and intend to follow
>> them.
>>
>> Documentation
>>
>> Documentation and technical details on Trafodion can be found at:
>> http://www.trafodion.org/
>>
>> Initial Source
>>
>> The source is available today in a public GitHub repository:
>> https://github.com/trafodion/trafodion.
>>
>> Source and Intellectual Property Submission Plan
>>
>> The source code has already been released under the Apache License, Version
>> 2. The manuals have been released in Adobe PDF format. As part of the
>> submission process, the source for the manuals will be converted from a
>> proprietary DocBook XML format to AsciiDoc.
>>
>> External Dependencies
>>
>> Two dependencies do not have Apache compatible licenses and will be
>> addressed as we enter incubation. One dependency is log4cpp, which is
>> licensed under the LGPL. A compatible alternative might be Apache incubator
>> project log4cxx. The other dependency is unixodbc, which is used as the
>> ODBC driver manager. We will look into how Apache Hive manages its use of
>> this incompatibly licensed software and do likewise. All other dependencies
>> have Apache compatible licenses, including Apache 2.0, MIT/X11, MIT, and
>> BSD.
>>
>> Cryptography
>>
>> Trafodion does not contain any cryptographic code. It does call
>> cryptographic libraries: OpenSSL for C++ code and Java Cryptography
>> Extension (JCE) for Java code.
>>
>> Required Resources
>>
>> Mailing Lists
>>
>> private@trafodion.incubator.apache.org
>> dev@trafodion.incubator.apache.org
>> commits@trafodion.incubator.apache.org
>>
>> Git Repository
>>
>> https://git-wip-us.apache.org/repos/asf/incubator-trafodion.git
>>
>> Issue Tracking
>>
>> JIRA: JIRA Trafodion (Trafodion)
>>
>>
>> Initial Committers and Affiliation
>>
>> Dave Birdsall, Hewlett-Packard Company, Dave.Birdsall<AT>hp<DOT>com
>> Matt Brown, Hewlett-Packard Company, mattbrown<AT>hp<DOT>com
>> Tharak Capirala, Hewlett-Packard Company, Tharak.Capirala<AT>hp<DOT>com
>> Alice Chen, Hewlett-Packard Company, Alice.Chen<AT>hp<DOT>com
>> John DeRoo, Hewlett-Packard Company, John.Deroo<AT>hp<DOT>com
>> Roberta Marton, Hewlett-Packard Company, Roberta.Marton<AT>hp<DOT>com
>> Amanda Moran, Hewlett-Packard Company, Amanda.Kay.Moran<AT>hp<DOT>com
>> Suresh Subbiah, Hewlett-Packard Company, Suresh.Subbiah<AT>hp<DOT>com
>> Sandhya Sundaresan, Hewlett-Packard Company,
>> Sandhya.Sundaresan<AT>hp<DOT>com
>>
>> Sponsors
>>
>> Champion
>>
>> Michael Stack, Stack<AT>apache<DOT>org
>>
>> Nominated Mentors
>>
>> Andrew Purtell, apurtell<AT>apache<DOT>org
>> Devaraj Das, ddas<AT>apache<DOT>org
>> Enis Söztutar, Enis<AT>apache<DOT>org
>> Lars Hofhansl, larsh<AT>apache<DOT>org
>> Michael Stack, Stack<AT>apache<DOT>org
>> Roman Shaposhnik, rshaposhnik<AT>pivotal<DOT>io
>>
>> Sponsoring Entity
>>
>> Apache Incubator PMC
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
> For additional commands, e-mail: general-help@incubator.apache.org
>


-- 
Ryan Blue
Software Engineer
Cloudera, Inc.


