incubator-general mailing list archives

From Enis Söztutar <e...@apache.org>
Subject Re: [DISCUSS] Trafodion Incubation Proposal
Date Tue, 12 May 2015 17:42:54 GMT
This looks pretty good. I can also join as a mentor.

Enis

On Sun, May 10, 2015 at 8:06 PM, Stack <stack@duboce.net> wrote:

> On Sun, May 10, 2015 at 7:13 PM, Konstantin Boudnik <cos@apache.org>
> wrote:
>
> > I think it'd be great to have a SQL platform for Hadoop
> >
> > +1
> >
> > I am mentoring 4 projects at the moment, but if you need a 1/2 time mentor -
> > count me in ;)
> >
> > Cos
> >
> >
> We'll take you up on your kind offer if we can't get someone less loaded.
>
> Thanks Cos,
>
> St.Ack
>
>
> > On Fri, May 08, 2015 at 02:59PM, Stack wrote:
> > > I would like to start up a discussion on Trafodion joining the ASF as
> an
> > > incubating project.
> > >
> > > Trafodion is a webscale SQL-on-Hadoop solution that enables
> transactional
> > > or operational workloads on Hadoop.
> > >
> > > The proposal is available on the wiki here:
> > > https://wiki.apache.org/incubator/TrafodionProposal#preview
> > >
> > > The proposal text is also attached to the end of this email.
> > >
> > > Trafodion is a rich, storied SQL engine that has recently been ported
> to
> > > run on HBase and Hadoop. I think it would make for a fine addition to
> the
> > > Apache family of projects. It would be good to hear what others think.
> > >
> > > Thank you in advance for giving the proposal a read.
> > >
> > > Yours,
> > > St.Ack
> > >
> > >
> > > Trafodion Apache Incubator Proposal
> > >
> > > Abstract
> > >
> > > Trafodion is a webscale SQL-on-Hadoop solution enabling transactional
> or
> > > operational workloads on Hadoop.
> > >
> > > Proposal
> > >
> > > Apache Trafodion builds on the scalability, elasticity, and flexibility
> > of
> > > Hadoop. Trafodion extends Hadoop to provide guaranteed transactional
> > > integrity, enabling new kinds of big data applications to run on
> Hadoop.
> > Key
> > > features of Apache Trafodion include:
> > >
> > > * Full-functioned ANSI SQL language support
> > > * JDBC/ODBC connectivity for Linux/Windows clients
> > > * Distributed ACID transaction protection across multiple statements,
> > > tables and rows
> > > * Performance improvements for OLTP workloads with compile-time and
> > > run-time optimizations
> > > * Support for large data sets using a parallel-aware query optimizer
> > > * ANSI SQL security and data integrity constraints including
> referential
> > > integrity
> > >
> > > Hewlett-Packard Company submits this proposal to donate its Apache
> > License,
> > > Version 2.0 open source project known as Trafodion, its source code,
> > > documentation, and web site content to the Apache Software Foundation
> in
> > > order to build an open source community.
> > >
> > > Background
> > >
> > > Trafodion is an open source project sponsored by HP, incubated at HP
> Labs
> > > and HP-IT, to develop an enterprise-class SQL-on-Hadoop solution
> > targeting
> > > big data transactional or operational workloads. HP publicly
> announced
> > > the open source project and uploaded the source code to GitHub in June
> > 2014.
> > >
> > > The SQL compiler, optimizer and executor components of Trafodion have a
> > > rich heritage. Under development since 1993, they were released as
> > > commercial closed source software in various flavors such as HP NonStop
> > > SQL/MX and HP Neoview. NonStop SQL/MX was designed for online
> transaction
> > > processing on HP’s NonStop (formerly Tandem) fault-tolerant servers and
> > is
> > > known for its high availability, scalability, and performance. Hundreds
> > of
> > > companies and thousands of servers are running mission-critical
> > > applications today on NonStop SQL/MX. In addition, many of these
> > components
> > > today are running internal to HP as the core of its Enterprise Data
> > > Warehouse (EDW), managing over a PB of data.
> > >
> > > Starting in 2013, the software was modified to run on HBase and a new
> > > distributed transaction manager was written to run as an HBase
> > co-processor.
> > >
> > > Unlike most NoSQL and other SQL-on-Hadoop open source projects,
> Trafodion
> > > provides comprehensive ANSI SQL language support including
> > full-functioned
> > > data definition (DDL), data manipulation (DML), transaction control
> (TCL)
> > > and database utility support.
> > >
> > > Trafodion provides comprehensive and standard SQL data manipulation
> > support
> > > including SELECT, INSERT, UPDATE, DELETE, and UPSERT/MERGE syntax with
> > > language options including join variants, unions, where predicates,
> > > aggregations (group by and having), sort ordering, sampling, correlated
> > and
> > > nested sub-queries, cursors, and many SQL functions.
> > >
> > > Utilities are provided for updating table statistics used by the
> > optimizer
> > > for costing (i.e. selectivity/cardinality estimates) plan alternatives,
> > for
> > > displaying the chosen SQL execution plan, plan shaping, backup and
> > > restoring the database, data loading and unloading, and a command line
> > > utility for interfacing with the database engine.
> > >
> > > Explicit control statements are provided to allow applications to
> define
> > > transaction boundaries and to abort transactions when warranted,
> > including
> > > BEGIN WORK, COMMIT WORK, ROLLBACK WORK and SET TRANSACTION.
> > >
> > > Trafodion supports ANSI’s grant/revoke semantics to define user and
> role
> > > privileges in terms of managing and accessing the database objects.
> > >
> > > Rationale
> > >
> > > The name “Trafodion” (the Welsh word for transactions, pronounced
> > > “Tra-vod-eee-on”) was chosen specifically to emphasize the
> > differentiation
> > > that Trafodion provides in closing a critical gap in the Hadoop
> > ecosystem.
> > > Trafodion builds on the scalability, elasticity, and flexibility of
> > Hadoop.
> > > Trafodion extends Hadoop to provide guaranteed transactional integrity,
> > > enabling new kinds of big data applications to run on Hadoop.
> > >
> > > Current Status
> > >
> > > HP released the Trafodion code under the Apache License, Version 2, in
> > June
> > > of 2014. Since that time, we have had one major release in January 2015
> > and
> > > one minor release in April 2015. The focus of these releases has been on
> > > getting our base functionality, including security, working on top of
> > > Apache HBase, as well as improving performance, availability and
> > > scalability, and integrating better with HBase.
> > >
> > > Meritocracy
> > >
> > > We want to build a diverse developer community, based on the Apache
> Way,
> > > around Trafodion. To help developers become contributors, we have
> > > documentation on the wiki about the architecture, the source tree
> > > structure, and an example enhancement. We plan to publish our project
> > > backlog to the community, specifically highlighting areas where
> > developers
> > > new to Trafodion may best start contributing, such as extending the
> > > database functionality with User Defined Routines (UDRs) and
> integrating
> > > with other Apache projects in the Hadoop ecosystem.
> > >
> > > Community
> > >
> > > We have already begun building a community but at this time the
> community
> > > consists only of Trafodion developers – all HP employees – and
> > prospective
> > > users. We have participated in and hosted HBase Meetups and intend to
> > ramp
> > > up our community building efforts.
> > >
> > > The Trafodion project has seen interest in China, where HP has
> conducted
> > > proof-of-concepts with multiple companies and expects to see some of
> its
> > > first commercial deployments. To help recruit contributors and users in
> > > China, members of the team are translating Trafodion wiki content into
> > > Mandarin.
> > >
> > > Core Developers
> > >
> > > The core developers are very experienced in database and transaction
> > > monitor technology, with many having spent more than 20 years working
> in
> > > this space.
> > >
> > > Alignment
> > >
> > > Apache Trafodion relies on Apache HBase as its storage engine. The
> > > development team has collaborated with and gained valuable advice from
> > > working with the Apache HBase core developers. Apache Trafodion has
> > > federation capabilities as well, and can query Trafodion tables stored
> in
> > > HBase, native HBase tables, and Apache Hive tables.
> > >
> > > Known Risks
> > >
> > > Orphaned Products
> > >
> > > HP Labs and HP-IT have been incubating Trafodion development for almost
> > two
> > > years. This is part of HP’s strategy to leverage its investment in
> > database
> > > software and bring software to market as open source and is similar to
> > HP’s
> > > efforts with OpenStack. Trafodion builds on HP’s equity investment in
> the
> > > Hadoop ecosystem and its efforts to monetize Hadoop through hardware,
> > > software, and services. HP wants Trafodion to be successful, as HP will
> > > offer a commercially supported distribution of Trafodion.
> > >
> > > Inexperience with Open Source
> > >
> > > We have been working with open source software in building closed
> source
> > > software for well over two decades. To help transition to doing open
> > source
> > > development, the development team received guidance and best practices
> > from
> > > HP developers working on OpenStack open source projects, many of whom
> > have
> > > experience working on Apache and other open source projects as well.
> > Since
> > > releasing Trafodion as an open source project in June of 2014, the
> > > committers and contributors have moved forward using open source
> > > development processes and tools for bug tracking and design blueprints
> > and
> > > Jenkins for continuous integration. As part of the incubation process,
> we
> > > recognize we may need to change some of our development processes/tools
> > and
> > > conduct our discussions using Apache email dlists.
> > >
> > > Homogenous Developers
> > >
> > > Since the initial development of Trafodion has been supported by HP,
> all
> > of
> > > the current developers are HP employees. Through the support of the
> > Apache
> > > incubation project, we aim to expand the list of developers and gain
> > > contributors from related SQL-on-Hadoop projects and the Apache HBase
> > > project. Trafodion developers are experienced with distributed
> > development
> > > processes, being primarily based in Palo Alto, CA; Austin, TX; and
> > > Shanghai, China. Trafodion is written in C++ and Java.
> > >
> > > Reliance on Salaried Developers
> > >
> > > Currently all of the developers working on the project are paid by
> their
> > > employer to work on the project. These developers will work on the open
> > > source project as well as work on the commercially supported
> distribution
> > > of Trafodion that HP will offer.
> > >
> > > Relationship with Other Apache Products
> > >
> > > Trafodion is built upon Apache HBase and extends it to support ACID
> > > transactions with HBase co-processors for distributed transaction
> > > management and recovery. Trafodion envisions future collaborations with
> > the
> > > Apache HBase project on performance optimizations, such as in the areas
> > of
> > > mixed workload support, High Availability, etc. Trafodion also provides
> > > transactional support for, and querying of, native HBase tables.
> > >
> > > Trafodion uses Apache Zookeeper to coordinate and manage the
> distribution
> > > of connection services across the cluster for load-balancing and high
> > > availability reconnection purposes in the event a Trafodion process
> > should
> > > fail.
> > >
> > > Trafodion also envisions working with the Apache Ambari project on
> > enabling
> > > better Trafodion manageability. While Ambari focuses on system and
> > > component level performance metrics, Trafodion manageability will focus
> > in
> > > a complementary way on database workload monitoring and performance
> > > analytics with capabilities more geared towards database
> administrators.
> > >
> > > There are alternative open source projects that are providing
> > SQL-on-Hadoop
> > > capabilities, such as Apache Hive, Apache Drill, and Apache Phoenix.
> > These
> > > are more focused on reporting and analytics across data structures
> > > supported on HDFS. In comparison to all of these technologies Trafodion
> > > provides a very complete implementation of ANSI SQL, one of the most
> > > sophisticated optimizers for such workloads, a completely parallel data
> > > flow architecture that does not materialize intermediate results unless
> > > necessary, full ACID transactional support, ANSI GRANT/REVOKE security,
> > and
> > > other capabilities that would take decades to build in these products.
> On
> > > the other hand currently Trafodion is just focused on HBase and
> querying
> > > Hive, whereas Hive and Drill provide access to other data formats in
> > HDFS.
> > >
> > > An Excessive Fascination with the Apache Brand
> > >
> > > We understand the reputation and value of the Apache brand, and have no
> > > doubt that it will help us attract contributors and users. Our
> primary
> > > goal is to follow a proven, open source development and community
> > building
> > > model that will make Trafodion successful and enable better
> collaboration
> > > with other Apache projects in the Hadoop ecosystem. We also understand
> > the
> > > rules and guidelines about the use of the Apache brand and intend to
> > follow
> > > them.
> > >
> > > Documentation
> > >
> > > Documentation and technical details on Trafodion can be found at:
> > > http://www.trafodion.org/
> > >
> > > Initial Source
> > >
> > > The source is available today in a public github repository:
> > > https://github.com/trafodion/trafodion.
> > >
> > > Source and Intellectual Property Submission Plan
> > >
> > > The source code has already been released under the Apache License,
> > Version
> > > 2. The manuals have been released in Adobe PDF format. As part of the
> > > submission process, the source for the manuals will be converted from a
> > > proprietary DocBook XML format to AsciiDoc.
> > >
> > > External Dependencies
> > >
> > > Two dependencies do not have Apache compatible licenses and will be
> > > addressed as we enter incubation. One dependency is log4cpp, which is
> > > licensed under the LGPL. A compatible alternative might be Apache
> > incubator
> > > project log4cxx. The other dependency is unixodbc, which is used as the
> > > ODBC driver manager. We will look into how Apache Hive manages being
> able
> > > to use this incompatible software and do the same. All other
> dependencies
> > > have Apache compatible licenses, including Apache 2.0, MIT/X11, MIT,
> and
> > > BSD.
> > >
> > > Cryptography
> > >
> > > Trafodion does not contain any cryptographic code. It does call
> > > cryptographic libraries: OpenSSL for C++ code and Java Cryptography
> > > Extension (JCE) for Java code.
> > >
> > > Required Resources
> > >
> > > Mailing Lists
> > >
> > > private@trafodion.incubator.apache.org
> > > dev@trafodion.incubator.apache.org
> > > commits@trafodion.incubator.apache.org
> > >
> > > Git Repository
> > >
> > > https://git-wip-us.apache.org/repos/afs/incubator-trafodion.git
> > >
> > > Issue Tracking
> > >
> > > JIRA: JIRA Trafodion (Trafodion)
> > >
> > >
> > > Initial Committers and Affiliation
> > >
> > > Dave Birdsall, Hewlett-Packard Company, Dave.Birdsall<AT>hp<DOT>com
> > > Matt Brown, Hewlett-Packard Company, mattbrown<AT>hp<DOT>com
> > > Tharak Capirala, Hewlett-Packard Company, Tharak.Capirala<AT>hp<DOT>com
> > > Alice Chen, Hewlett-Packard Company, Alice.Chen<AT>hp<DOT>com
> > > John DeRoo, Hewlett-Packard Company, John.Deroo<AT>hp<DOT>com
> > > Roberta Marton, Hewlett-Packard Company, Roberta.Marton<AT>hp<DOT>com
> > > Amanda Moran, Hewlett-Packard Company, Amanda.Kay.Moran<AT>hp<DOT>com
> > > Suresh Subbiah, Hewlett-Packard Company, Suresh.Subbiah<AT>hp<DOT>com
> > > Sandhya Sundaresan, Hewlett-Packard Company, Sandhya.Sundaresan<AT>hp<DOT>com
> > >
> > > Sponsors
> > >
> > > Champion
> > >
> > > Michael Stack, Stack<AT>apache<DOT>org
> > >
> > > Nominated Mentors
> > >
> > > Michael Stack, Stack<AT>apache<DOT>org
> > > Roman Shaposhnik, rshaposhnik<AT>pivotal<DOT>io
> > >
> > > We are seeking additional mentors.
> > >
> > > Sponsoring Entity
> > >
> > > Apache Incubator PMC
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
> > For additional commands, e-mail: general-help@incubator.apache.org
> >
> >
>
