incubator-general mailing list archives

From Stack <st...@duboce.net>
Subject Re: [DISCUSS] Trafodion Incubation Proposal
Date Mon, 11 May 2015 03:06:32 GMT
On Sun, May 10, 2015 at 7:13 PM, Konstantin Boudnik <cos@apache.org> wrote:

> I think it'd be great to have SQL platform for Hadoop
>
> +1
>
> I am mentoring 4 projects at the moment, but if you need a 1/2 time mentor -
> count me in ;)
>
> Cos
>
>
We'll take you up on your kind offer if we can't get someone less loaded.

Thanks Cos,

St.Ack


> On Fri, May 08, 2015 at 02:59PM, Stack wrote:
> > I would like to start up a discussion on Trafodion joining the ASF as an
> > incubating project.
> >
> > Trafodion is a webscale SQL-on-Hadoop solution that enables transactional
> > or operational workloads on Hadoop.
> >
> > The proposal is available on the wiki here:
> > https://wiki.apache.org/incubator/TrafodionProposal#preview
> >
> > The proposal text is also attached to the end of this email.
> >
> > Trafodion is a rich, storied SQL engine that has recently been ported to
> > run on HBase and Hadoop. I think it would make for a fine addition to the
> > Apache family of projects. It would be good to hear what others think.
> >
> > Thank you in advance for giving the proposal a read.
> >
> > Yours,
> > St.Ack
> >
> >
> > Trafodion Apache Incubator Proposal
> >
> > Abstract
> >
> > Trafodion is a webscale SQL-on-Hadoop solution enabling transactional or
> > operational workloads on Hadoop.
> >
> > Proposal
> >
> > Apache Trafodion builds on the scalability, elasticity, and flexibility of
> > Hadoop. Trafodion extends Hadoop to provide guaranteed transactional
> > integrity, enabling new kinds of big data applications to run on Hadoop.
> > Key features of Apache Trafodion include:
> >
> > * Full-functioned ANSI SQL language support
> > * JDBC/ODBC connectivity for Linux/Windows clients
> > * Distributed ACID transaction protection across multiple statements,
> > tables and rows
> > * Performance improvements for OLTP workloads with compile-time and
> > run-time optimizations
> > * Support for large data sets using a parallel-aware query optimizer
> > * ANSI SQL security and data integrity constraints including referential
> > integrity
> >
> > Hewlett-Packard Company submits this proposal to donate its Apache License,
> > Version 2.0 open source project known as Trafodion, its source code,
> > documentation, and web site content to the Apache Software Foundation in
> > order to build an open source community.
> >
> > Background
> >
> > Trafodion is an open source project sponsored by HP, incubated at HP Labs
> > and HP-IT, to develop an enterprise-class SQL-on-Hadoop solution targeting
> > big data transactional or operational workloads. HP publicly announced
> > the open source project and uploaded the source code to GitHub in June 2014.
> >
> > The SQL compiler, optimizer and executor components of Trafodion have a
> > rich heritage. Under development since 1993, they were released as
> > commercial closed source software in various flavors such as HP NonStop
> > SQL/MX and HP Neoview. NonStop SQL/MX was designed for online transaction
> > processing on HP’s NonStop (formerly Tandem) fault-tolerant servers and is
> > known for its high availability, scalability, and performance. Hundreds of
> > companies and thousands of servers are running mission-critical
> > applications today on NonStop SQL/MX. In addition, many of these components
> > run inside HP today as the core of its Enterprise Data
> > Warehouse (EDW), managing over a PB of data.
> >
> > Starting in 2013, the software was modified to run on HBase and a new
> > distributed transaction manager was written to run as an HBase co-processor.
> >
> > Unlike most NoSQL and other SQL-on-Hadoop open source projects, Trafodion
> > provides comprehensive ANSI SQL language support including full-functioned
> > data definition (DDL), data manipulation (DML), transaction control (TCL)
> > and database utility support.
> >
> > Trafodion provides comprehensive and standard SQL data manipulation support
> > including SELECT, INSERT, UPDATE, DELETE, and UPSERT/MERGE syntax with
> > language options including join variants, unions, where predicates,
> > aggregations (group by and having), sort ordering, sampling, correlated and
> > nested sub-queries, cursors, and many SQL functions.
> >
> > Utilities are provided for updating the table statistics that the optimizer
> > uses to cost plan alternatives (i.e. selectivity/cardinality estimates), for
> > displaying the chosen SQL execution plan, for plan shaping, for backing up
> > and restoring the database, and for loading and unloading data, along with a
> > command line utility for interfacing with the database engine.
> >
> > Explicit control statements are provided to allow applications to define
> > transaction boundaries and to abort transactions when warranted, including
> > BEGIN WORK, COMMIT WORK, ROLLBACK WORK and SET TRANSACTION.
> >
> > Trafodion supports ANSI’s grant/revoke semantics to define user and role
> > privileges in terms of managing and accessing the database objects.
> >
> > Rationale
> >
> > The name “Trafodion” (the Welsh word for transactions, pronounced
> > “Tra-vod-eee-on”) was chosen specifically to emphasize the differentiation
> > that Trafodion provides in closing a critical gap in the Hadoop ecosystem.
> > Trafodion builds on the scalability, elasticity, and flexibility of Hadoop.
> > Trafodion extends Hadoop to provide guaranteed transactional integrity,
> > enabling new kinds of big data applications to run on Hadoop.
> >
> > Current Status
> >
> > HP released the Trafodion code under the Apache License, Version 2.0, in June
> > of 2014. Since that time, we have had one major release in January 2015 and
> > one minor release in April 2015. The focus of these releases has been on
> > getting our base functionality, including security, working on top of
> > Apache HBase, as well as improving performance, availability and
> > scalability, and integrating better with HBase.
> >
> > Meritocracy
> >
> > We want to build a diverse developer community, based on the Apache Way,
> > around Trafodion. To help developers become contributors, we have
> > documentation on the wiki about the architecture, the source tree
> > structure, and an example enhancement. We plan to publish our project
> > backlog to the community, specifically highlighting areas where developers
> > new to Trafodion may best start contributing, such as extending the
> > database functionality with User Defined Routines (UDRs) and integrating
> > with other Apache projects in the Hadoop ecosystem.
> >
> > Community
> >
> > We have already begun building a community but at this time the community
> > consists only of Trafodion developers – all HP employees – and prospective
> > users. We have participated in and hosted HBase Meetups and intend to ramp
> > up our community building efforts.
> >
> > The Trafodion project has seen interest in China, where HP has conducted
> > proof-of-concepts with multiple companies and expects to see some of its
> > first commercial deployments. To help recruit contributors and users in
> > China, members of the team are translating Trafodion wiki content into
> > Mandarin.
> >
> > Core Developers
> >
> > The core developers are very experienced in database and transaction
> > monitor technology, with many having spent more than 20 years working in
> > this space.
> >
> > Alignment
> >
> > Apache Trafodion relies on Apache HBase as its storage engine. The
> > development team has collaborated with and gained valuable advice from
> > working with the Apache HBase core developers. Apache Trafodion has
> > federation capabilities as well, and can query Trafodion tables stored in
> > HBase, native HBase tables, and Apache Hive tables.
> >
> > Known Risks
> >
> > Orphaned Products
> >
> > HP Labs and HP-IT have been incubating Trafodion development for almost two
> > years. This is part of HP’s strategy to leverage its investment in database
> > software and bring software to market as open source and is similar to HP’s
> > efforts with OpenStack. Trafodion builds on HP’s equity investment in the
> > Hadoop ecosystem and its efforts to monetize Hadoop through hardware,
> > software, and services. HP wants Trafodion to be successful, as HP will
> > offer a commercially supported distribution of Trafodion.
> >
> > Inexperience with Open Source
> >
> > We have been working with open source software in building closed source
> > software for well over two decades. To help transition to doing open source
> > development, the development team received guidance and best practices from
> > HP developers working on OpenStack open source projects, many of whom have
> > experience working on Apache and other open source projects as well. Since
> > releasing Trafodion as an open source project in June of 2014, the
> > committers and contributors have moved forward using open source
> > development processes and tools for bug tracking and design blueprints and
> > Jenkins for continuous integration. As part of the incubation process, we
> > recognize we may need to change some of our development processes/tools and
> > conduct our discussions on Apache mailing lists.
> >
> > Homogenous Developers
> >
> > Since the initial development of Trafodion has been supported by HP, all of
> > the current developers are HP employees. Through the support of the Apache
> > incubation project, we aim to expand the list of developers and gain
> > contributors from related SQL-on-Hadoop projects and the Apache HBase
> > project. Trafodion developers are experienced with distributed development
> > processes, being primarily based in Palo Alto, CA; Austin, TX; and
> > Shanghai, China. Trafodion is written in C++ and Java.
> >
> > Reliance on Salaried Developers
> >
> > Currently all of the developers working on the project are paid by their
> > employer to work on the project. These developers will work on the open
> > source project as well as work on the commercially supported distribution
> > of Trafodion that HP will offer.
> >
> > Relationship with Other Apache Products
> >
> > Trafodion is built upon Apache HBase and extends it to support ACID
> > transactions with HBase co-processors for distributed transaction
> > management and recovery. Trafodion envisions future collaborations with the
> > Apache HBase project on performance optimizations, such as in the areas of
> > mixed workload support and high availability. Trafodion also provides
> > transactional support for, and querying of, native HBase tables.
> >
> > Trafodion uses Apache ZooKeeper to coordinate and manage the distribution
> > of connection services across the cluster for load-balancing and high
> > availability reconnection purposes in the event a Trafodion process should
> > fail.
> >
> > Trafodion also envisions working with the Apache Ambari project on enabling
> > better Trafodion manageability. While Ambari focuses on system and
> > component level performance metrics, Trafodion manageability will focus in
> > a complementary way on database workload monitoring and performance
> > analytics with capabilities more geared towards database administrators.
> >
> > There are alternative open source projects that are providing SQL-on-Hadoop
> > capabilities, such as Apache Hive, Apache Drill, and Apache Phoenix. These
> > are more focused on reporting and analytics across data structures
> > supported on HDFS. In comparison to all of these technologies, Trafodion
> > provides a very complete implementation of ANSI SQL, one of the most
> > sophisticated optimizers for such workloads, a completely parallel data
> > flow architecture that does not materialize intermediate results unless
> > necessary, full ACID transactional support, ANSI GRANT/REVOKE security, and
> > other capabilities that would take decades to build in these products. On
> > the other hand, Trafodion currently focuses only on HBase and querying
> > Hive, whereas Hive and Drill provide access to other data formats in HDFS.
> >
> > An Excessive Fascination with the Apache Brand
> >
> > We understand the reputation and value of the Apache brand, and have no
> > doubt that it will help us attract contributors and users. Our primary
> > goal is to follow a proven, open source development and community building
> > model that will make Trafodion successful and enable better collaboration
> > with other Apache projects in the Hadoop ecosystem. We also understand the
> > rules and guidelines about the use of the Apache brand and intend to follow
> > them.
> >
> > Documentation
> >
> > Documentation and technical details on Trafodion can be found at:
> > http://www.trafodion.org/
> >
> > Initial Source
> >
> > The source is available today in a public GitHub repository:
> > https://github.com/trafodion/trafodion.
> >
> > Source and Intellectual Property Submission Plan
> >
> > The source code has already been released under the Apache License, Version
> > 2.0. The manuals have been released in Adobe PDF format. As part of the
> > submission process, the source for the manuals will be converted from a
> > proprietary DocBook XML format to AsciiDoc.
> >
> > External Dependencies
> >
> > Two dependencies do not have Apache compatible licenses and will be
> > addressed as we enter incubation. One dependency is log4cpp, which is
> > licensed under the LGPL. A compatible alternative might be Apache incubator
> > project log4cxx. The other dependency is unixODBC, which is used as the
> > ODBC driver manager. We will look into how Apache Hive manages being able
> > to use this incompatible software and do the same. All other dependencies
> > have Apache compatible licenses, including Apache 2.0, MIT/X11, MIT, and
> > BSD.
> >
> > Cryptography
> >
> > Trafodion does not contain any cryptographic code. It does call
> > cryptographic libraries: OpenSSL for C++ code and Java Cryptography
> > Extension (JCE) for Java code.
> >
> > Required Resources
> >
> > Mailing Lists
> >
> > private@trafodion.incubator.apache.org
> > dev@trafodion.incubator.apache.org
> > commits@trafodion.incubator.apache.org
> >
> > Git Repository
> >
> > https://git-wip-us.apache.org/repos/asf/incubator-trafodion.git
> >
> > Issue Tracking
> >
> > JIRA: JIRA Trafodion (Trafodion)
> >
> >
> > Initial Committers and Affiliation
> >
> > Dave Birdsall, Hewlett-Packard Company, Dave.Birdsall<AT>hp<DOT>com
> > Matt Brown, Hewlett-Packard Company, mattbrown<AT>hp<DOT>com
> > Tharak Capirala, Hewlett-Packard Company, Tharak.Capirala<AT>hp<DOT>com
> > Alice Chen, Hewlett-Packard Company, Alice.Chen<AT>hp<DOT>com
> > John DeRoo, Hewlett-Packard Company, John.Deroo<AT>hp<DOT>com
> > Roberta Marton, Hewlett-Packard Company, Roberta.Marton<AT>hp<DOT>com
> > Amanda Moran, Hewlett-Packard Company, Amanda.Kay.Moran<AT>hp<DOT>com
> > Suresh Subbiah, Hewlett-Packard Company, Suresh.Subbiah<AT>hp<DOT>com
> > Sandhya Sundaresan, Hewlett-Packard Company, Sandhya.Sundaresan<AT>hp<DOT>com
> >
> > Sponsors
> >
> > Champion
> >
> > Michael Stack, Stack<AT>apache<DOT>org
> >
> > Nominated Mentors
> >
> > Michael Stack, Stack<AT>apache<DOT>org
> > Roman Shaposhnik, rshaposhnik<AT>pivotal<DOT>io
> >
> > We are seeking additional mentors.
> >
> > Sponsoring Entity
> >
> > Apache Incubator PMC
>
