From: Dan Di Spaltro
Date: Sun, 10 May 2015 17:25:38 -0700
Subject: Re: [DISCUSS] Trafodion Incubation Proposal
To: general@incubator.apache.org

I think this would be a great addition, and I am sure this will help fix
the committer diversity.

-Dan

On Fri, May 8, 2015 at 2:59 PM, Stack wrote:

> I would like to start up a discussion on Trafodion joining the ASF as an
> incubating project.
>
> Trafodion is a webscale SQL-on-Hadoop solution that enables transactional
> or operational workloads on Hadoop.
>
> The proposal is available on the wiki here:
> https://wiki.apache.org/incubator/TrafodionProposal#preview
>
> The proposal text is also attached to the end of this email.
>
> Trafodion is a rich, storied SQL engine that has recently been ported to
> run on HBase and Hadoop.
> I think it would make for a fine addition to the Apache family of
> projects. It would be good to hear what others think.
>
> Thank you in advance for giving the proposal a read.
>
> Yours,
> St.Ack
>
>
> Trafodion Apache Incubator Proposal
>
> Abstract
>
> Trafodion is a webscale SQL-on-Hadoop solution enabling transactional or
> operational workloads on Hadoop.
>
> Proposal
>
> Apache Trafodion builds on the scalability, elasticity, and flexibility
> of Hadoop. Trafodion extends Hadoop to provide guaranteed transactional
> integrity, enabling new kinds of big data applications to run on Hadoop.
> Key features of Apache Trafodion include:
>
> * Full-functioned ANSI SQL language support
> * JDBC/ODBC connectivity for Linux/Windows clients
> * Distributed ACID transaction protection across multiple statements,
>   tables and rows
> * Performance improvements for OLTP workloads with compile-time and
>   run-time optimizations
> * Support for large data sets using a parallel-aware query optimizer
> * ANSI SQL security and data integrity constraints including referential
>   integrity
>
> Hewlett-Packard Company submits this proposal to donate its Apache
> License, Version 2.0 open source project known as Trafodion, its source
> code, documentation, and web site content to the Apache Software
> Foundation in order to build an open source community.
>
> Background
>
> Trafodion is an open source project sponsored by HP, incubated at HP Labs
> and HP-IT, to develop an enterprise-class SQL-on-Hadoop solution
> targeting big data transactional or operational workloads. HP publicly
> announced the open source project and uploaded the source code to GitHub
> in June 2014.
>
> The SQL compiler, optimizer and executor components of Trafodion have a
> rich heritage. Under development since 1993, they were released as
> commercial closed source software in various flavors such as HP NonStop
> SQL/MX and HP Neoview.
> NonStop SQL/MX was designed for online transaction processing on HP’s
> NonStop (formerly Tandem) fault-tolerant servers and is known for its
> high availability, scalability, and performance. Hundreds of companies
> and thousands of servers are running mission-critical applications today
> on NonStop SQL/MX. In addition, many of these components are running
> today inside HP as the core of its Enterprise Data Warehouse (EDW),
> managing over a PB of data.
>
> Starting in 2013, the software was modified to run on HBase and a new
> distributed transaction manager was written to run as an HBase
> co-processor.
>
> Unlike most NoSQL and other SQL-on-Hadoop open source projects, Trafodion
> provides comprehensive ANSI SQL language support including
> full-functioned data definition (DDL), data manipulation (DML),
> transaction control (TCL) and database utility support.
>
> Trafodion provides comprehensive and standard SQL data manipulation
> support including SELECT, INSERT, UPDATE, DELETE, and UPSERT/MERGE syntax
> with language options including join variants, unions, where predicates,
> aggregations (group by and having), sort ordering, sampling, correlated
> and nested sub-queries, cursors, and many SQL functions.
>
> Utilities are provided for updating the table statistics that the
> optimizer uses to cost plan alternatives (i.e. selectivity/cardinality
> estimates), for displaying the chosen SQL execution plan, plan shaping,
> backing up and restoring the database, and data loading and unloading,
> along with a command line utility for interfacing with the database
> engine.
>
> Explicit control statements are provided to allow applications to define
> transaction boundaries and to abort transactions when warranted,
> including BEGIN WORK, COMMIT WORK, ROLLBACK WORK and SET TRANSACTION.
>
> Trafodion supports ANSI’s grant/revoke semantics to define user and role
> privileges in terms of managing and accessing the database objects.
>
> Rationale
>
> The name “Trafodion” (the Welsh word for transactions, pronounced
> “Tra-vod-eee-on”) was chosen specifically to emphasize the
> differentiation that Trafodion provides in closing a critical gap in the
> Hadoop ecosystem. Trafodion builds on the scalability, elasticity, and
> flexibility of Hadoop. Trafodion extends Hadoop to provide guaranteed
> transactional integrity, enabling new kinds of big data applications to
> run on Hadoop.
>
> Current Status
>
> HP released the Trafodion code under the Apache License, Version 2, in
> June of 2014. Since that time, we have had one major release in January
> 2015 and one minor release in April 2015. The focus of these releases has
> been on getting our base functionality, including security, working on
> top of Apache HBase, as well as improving performance, availability and
> scalability, and integrating better with HBase.
>
> Meritocracy
>
> We want to build a diverse developer community, based on the Apache Way,
> around Trafodion. To help developers become contributors, we have
> documentation on the wiki about the architecture, the source tree
> structure, and an example enhancement. We plan to publish our project
> backlog to the community, specifically highlighting areas where
> developers new to Trafodion may best start contributing, such as
> extending the database functionality with User Defined Routines (UDRs)
> and integrating with other Apache projects in the Hadoop ecosystem.
>
> Community
>
> We have already begun building a community but at this time the community
> consists only of Trafodion developers – all HP employees – and
> prospective users. We have participated in and hosted HBase Meetups and
> intend to ramp up our community building efforts.
>
> The Trafodion project has seen interest in China, where HP has conducted
> proofs of concept with multiple companies and expects to see some of its
> first commercial deployments. To help recruit contributors and users in
> China, members of the team are translating Trafodion wiki content into
> Mandarin.
>
> Core Developers
>
> The core developers are very experienced in database and transaction
> monitor technology, with many having spent more than 20 years working in
> this space.
>
> Alignment
>
> Apache Trafodion relies on Apache HBase as its storage engine. The
> development team has collaborated with and gained valuable advice from
> working with the Apache HBase core developers. Apache Trafodion has
> federation capabilities as well, and can query Trafodion tables stored in
> HBase, native HBase tables, and Apache Hive tables.
>
> Known Risks
>
> Orphaned Products
>
> HP Labs and HP-IT have been incubating Trafodion development for almost
> two years. This is part of HP’s strategy to leverage its investment in
> database software and bring software to market as open source and is
> similar to HP’s efforts with OpenStack. Trafodion builds on HP’s equity
> investment in the Hadoop ecosystem and its efforts to monetize Hadoop
> through hardware, software, and services. HP wants Trafodion to be
> successful, as HP will offer a commercially supported distribution of
> Trafodion.
>
> Inexperience with Open Source
>
> We have been working with open source software in building closed source
> software for well over two decades. To help transition to doing open
> source development, the development team received guidance and best
> practices from HP developers working on OpenStack open source projects,
> many of whom have experience working on Apache and other open source
> projects as well.
> Since releasing Trafodion as an open source project in June of 2014, the
> committers and contributors have moved forward using open source
> development processes and tools for bug tracking and design blueprints
> and Jenkins for continuous integration. As part of the incubation
> process, we recognize we may need to change some of our development
> processes/tools and conduct our discussions using Apache mailing lists.
>
> Homogeneous Developers
>
> Since the initial development of Trafodion has been supported by HP, all
> of the current developers are HP employees. Through the support of the
> Apache incubation project, we aim to expand the list of developers and
> gain contributors from related SQL-on-Hadoop projects and the Apache
> HBase project. Trafodion developers are experienced with distributed
> development processes, being primarily based in Palo Alto, CA; Austin,
> TX; and Shanghai, China. Trafodion is written in C++ and Java.
>
> Reliance on Salaried Developers
>
> Currently all of the developers working on the project are paid by their
> employer to work on the project. These developers will work on the open
> source project as well as work on the commercially supported distribution
> of Trafodion that HP will offer.
>
> Relationship with Other Apache Products
>
> Trafodion is built upon Apache HBase and extends it to support ACID
> transactions with HBase co-processors for distributed transaction
> management and recovery. Trafodion envisions future collaborations with
> the Apache HBase project on performance optimizations, such as in the
> areas of mixed workload support, High Availability, etc. It also provides
> transactional support for, and querying of, native HBase tables.
>
> Trafodion uses Apache ZooKeeper to coordinate and manage the distribution
> of connection services across the cluster for load-balancing and high
> availability reconnection purposes in the event a Trafodion process
> should fail.
>
> Trafodion also envisions working with the Apache Ambari project on
> enabling better Trafodion manageability. While Ambari focuses on system
> and component level performance metrics, Trafodion manageability will
> focus in a complementary way on database workload monitoring and
> performance analytics with capabilities more geared towards database
> administrators.
>
> There are alternative open source projects that are providing
> SQL-on-Hadoop capabilities, such as Apache Hive, Apache Drill, and Apache
> Phoenix. These are more focused on reporting and analytics across data
> structures supported on HDFS. In comparison to all of these technologies,
> Trafodion provides a very complete implementation of ANSI SQL, one of the
> most sophisticated optimizers for such workloads, a completely parallel
> data flow architecture that does not materialize intermediate results
> unless necessary, full ACID transactional support, ANSI GRANT/REVOKE
> security, and other capabilities that would take decades to build in
> these products. On the other hand, Trafodion is currently focused only on
> HBase and querying Hive, whereas Hive and Drill provide access to other
> data formats in HDFS.
>
> An Excessive Fascination with the Apache Brand
>
> We understand the reputation and value of the Apache brand, and no doubt
> believe that it will help us attract contributors and users. Our primary
> goal is to follow a proven, open source development and community
> building model that will make Trafodion successful and enable better
> collaboration with other Apache projects in the Hadoop ecosystem. We also
> understand the rules and guidelines about the use of the Apache brand and
> intend to follow them.
>
> Documentation
>
> Documentation and technical details on Trafodion can be found at:
> http://www.trafodion.org/
>
> Initial Source
>
> The source is available today in a public GitHub repository:
> https://github.com/trafodion/trafodion.
>
> Source and Intellectual Property Submission Plan
>
> The source code has already been released under the Apache License,
> Version 2. The manuals have been released in Adobe PDF format. As part of
> the submission process, the source for the manuals will be converted from
> a proprietary DocBook XML format to AsciiDoc.
>
> External Dependencies
>
> Two dependencies do not have Apache compatible licenses and will be
> addressed as we enter incubation. One dependency is log4cpp, which is
> licensed under the LGPL. A compatible alternative might be Apache
> incubator project log4cxx. The other dependency is unixODBC, which is
> used as the ODBC driver manager. We will look into how Apache Hive
> manages being able to use this incompatible software and do likewise. All
> other dependencies have Apache compatible licenses, including Apache 2.0,
> MIT/X11, MIT, and BSD.
>
> Cryptography
>
> Trafodion does not contain any cryptographic code. It does call
> cryptographic libraries: OpenSSL for C++ code and Java Cryptography
> Extension (JCE) for Java code.
>
> Required Resources
>
> Mailing Lists
>
> private@trafodion.incubator.apache.org
> dev@trafodion.incubator.apache.org
> commits@trafodion.incubator.apache.org
>
> Git Repository
>
> https://git-wip-us.apache.org/repos/afs/incubator-trafodion.git
>
> Issue Tracking
>
> JIRA: JIRA Trafodion (Trafodion)
>
>
> Initial Committers and Affiliation
>
> Dave Birdsall, Hewlett-Packard Company, Dave.Birdsallhpcom
> Matt Brown, Hewlett-Packard Company, mattbrownhpcom
> Tharak Capirala, Hewlett-Packard Company, Tharak.Capiralahpcom
> Alice Chen, Hewlett-Packard Company, Alice.Chenhpcom
> John DeRoo, Hewlett-Packard Company, John.Deroohpcom
> Roberta Marton, Hewlett-Packard Company, Roberta.Martonhpcom
> Amanda Moran, Hewlett-Packard Company, Amanda.Kay.Moranhpcom
> Suresh Subbiah, Hewlett-Packard Company, Suresh.Subbiahhpcom
> Sandhya Sundaresan, Hewlett-Packard Company, Sandhya.Sundaresanhpcom
>
> Sponsors
>
> Champion
>
> Michael Stack, Stackapacheorg
>
> Nominated Mentors
>
> Michael Stack, Stackapacheorg
> Roman Shaposhnik, rshaposhnikpivotalio
>
> We are seeking additional mentors.
>
> Sponsoring Entity
>
> Apache Incubator PMC