incubator-general mailing list archives

From Andreas Neumann <a...@apache.org>
Subject Re: [VOTE] Accept Tephra into the Apache Incubator
Date Fri, 04 Mar 2016 20:32:44 GMT
+1 (non-binding)

-Andreas.

On Fri, Mar 4, 2016 at 12:19 PM, Terence Yim <chtyim@gmail.com> wrote:

> +1 (non-binding)
>
> Terence
>
> On Fri, Mar 4, 2016 at 1:13 AM, Jean-Baptiste Onofré <jb@nanthrax.net>
> wrote:
>
> > +1 (binding)
> >
> > Regards
> > JB
> >
> >
> > On 03/04/2016 02:29 AM, Poorna Chandra wrote:
> >
> >> Hi All,
> >>
> >> The Tephra proposal was sent out for discussion last week. The proposal is
> >> available at https://wiki.apache.org/incubator/TephraProposal
> >>
> >> Please vote to accept Tephra into the Apache Incubator. The vote will be
> >> open for the next 72 hours.
> >>
> >> [ ] +1 Accept Tephra as an Apache Incubator podling.
> >> [ ] +0 Abstain.
> >> [ ] -1 Don’t accept Tephra as an Apache Incubator podling because ...
> >>
> >> Thanks,
> >> Poorna.
> >>
> >> ------
> >>
> >> = Abstract =
> >>
> >> Tephra is a system for providing globally consistent transactions on
> >> top of Apache HBase and other storage engines.
> >>
> >> = Proposal =
> >>
> >> Tephra is a transaction engine for distributed data stores like Apache
> >> HBase. It provides ACID semantics for concurrent data operations that
> >> span region boundaries in HBase, using Optimistic Concurrency Control.
> >>
> >> = Background =
> >>
> >> HBase provides strong consistency with row- or region-level ACID
> >> operations. However, it sacrifices cross-region and cross-table
> >> consistency in favor of scalability. This trade-off requires application
> >> developers to handle the complexity of ensuring consistency when their
> >> modifications span region boundaries. By providing support for global
> >> transactions that span regions, tables, or multiple RPCs,
> >> Tephra simplifies application development on top of HBase, without a
> >> significant impact on performance or scalability for many workloads.
> >>
> >> Tephra leverages HBase’s native data versioning to provide
> >> multi-version concurrency control (MVCC) for transactional reads and
> >> writes. With MVCC, each transaction sees its own consistent “snapshot”
> >> of data, providing snapshot isolation of concurrent transactions.
> >> MVCC, together with conflict detection and handling, enables Optimistic
> >> Concurrency Control.
> >>
> >> Tephra consists of three main components:
> >>   * Transaction Server – maintains a global view of transaction state,
> >>     assigns new transaction IDs, and performs conflict detection;
> >>   * Transaction Client – coordinates start, commit, and rollback of
> >>     transactions; and
> >>   * Transaction Processor Coprocessor – applies filtering to the data
> >>     read (based on a given transaction’s state) and cleans up any data
> >>     from old (no longer visible) transactions.
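
[Editorial sketch, not part of the original proposal.] The interplay of MVCC snapshots and commit-time conflict detection described above can be illustrated with a small in-memory model. This is not Tephra's API: `ToyTransactionManager`, `Transaction`, and `ConflictError` are hypothetical names, a single counter stands in for the Transaction Server's ID assignment, and a Python dict stands in for HBase's versioned cells.

```python
import itertools


class ConflictError(Exception):
    """Raised when a commit overlaps with a concurrently committed write."""


class ToyTransactionManager:
    """Hypothetical in-memory stand-in for a transaction server plus store."""

    def __init__(self):
        self.clock = itertools.count(1)  # one counter for start and commit timestamps
        self.commits = {}                # writer tx id -> (commit timestamp, keys written)
        self.store = {}                  # key -> list of (writer tx id, value) versions

    def start(self):
        return Transaction(self, next(self.clock))


class Transaction:
    def __init__(self, manager, tx_id):
        self.manager = manager
        self.tx_id = tx_id       # also serves as the snapshot (start) timestamp
        self.write_set = set()

    def put(self, key, value):
        # Writes land in the store immediately, tagged with the writer's id.
        self.manager.store.setdefault(key, []).append((self.tx_id, value))
        self.write_set.add(key)

    def get(self, key):
        # Scan newest-first and skip versions outside this snapshot.
        for writer, value in reversed(self.manager.store.get(key, [])):
            if writer == self.tx_id:
                return value  # a transaction always sees its own writes
            commit = self.manager.commits.get(writer)
            if commit is not None and commit[0] < self.tx_id:
                return value  # committed before this transaction started
        return None

    def commit(self):
        # Optimistic check: fail if a transaction that committed after our
        # start wrote any key that we also wrote.
        for commit_ts, keys in self.manager.commits.values():
            if commit_ts > self.tx_id and keys & self.write_set:
                self.rollback()
                raise ConflictError(f"write conflict in transaction {self.tx_id}")
        self.manager.commits[self.tx_id] = (
            next(self.manager.clock),
            frozenset(self.write_set),
        )

    def rollback(self):
        # Remove every version this transaction wrote.
        for key in self.write_set:
            versions = self.manager.store[key]
            self.manager.store[key] = [(w, v) for w, v in versions if w != self.tx_id]
```

In this model, two transactions that start concurrently and write the same key cannot both commit: the second one to commit detects the overlap, rolls back its versions, and raises `ConflictError`, while readers keep seeing the snapshot that was committed before they started.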
> >>
> >> Although Tephra currently supports only HBase, it can be extended to
> >> support transactions on any store that has multi-versioning and rollback
> >> support. Transactions can span multiple stores and storage paradigms.
> >>
> >> = Rationale =
> >>
> >> Tephra has simple abstractions that an application can use to add
> >> transaction support over HBase. By delegating transaction handling to
> >> Tephra, the application is freed of transaction logic, and the
> >> application developer can focus on the use case.
> >> Also, Tephra can be extended to support transactions on data sources
> >> other than HBase.
> >>
> >> By making Tephra an Apache open source project, we believe that there
> >> will be wider adoption and more opportunities for Tephra to be integrated
> >> into other Apache projects.
> >>
> >> = Current Status =
> >>
> >> Tephra was built at Cask Data Inc., initially as part of the
> >> open-source framework Cask Data Application Platform (CDAP)
> >> [[http://cdap.io/]].
> >> It was later converted into an independent open source project under the
> >> Apache 2.0 License [[https://github.com/caskdata/tephra]].
> >>
> >> Tephra is used in CDAP as the transaction engine. As part of CDAP,
> >> Tephra has been deployed at multiple companies.
> >>
> >> Apache Phoenix is using Tephra as its transaction engine in the next
> >> release.
> >>
> >> == Meritocracy ==
> >>
> >> Our intent with this incubator proposal is to start building a diverse
> >> developer community around Tephra following the Apache meritocracy
> >> model.
> >> Since Tephra was initially developed in early 2013, we have had fast
> >> adoption and contributions within Cask Data. We are looking forward to
> >> new contributors. We wish to build a community based on Apache's
> >> meritocracy principles, working with those who contribute significantly
> >> to the project and welcoming them to be committers both during the
> >> incubation process and beyond.
> >>
> >> == Community ==
> >>
> >> Core developers of Tephra are at Cask Data. Recently the developer
> >> community has expanded to include folks from Apache Phoenix. We hope to
> >> extend our contributor base significantly, and we will invite all who
> >> are interested in working on a distributed transaction engine.
> >>
> >> == Core Developers ==
> >>
> >> A few engineers from Cask Data and outside have developed Tephra:
> >> Andreas Neumann, Terence Yim, Gary Helmling, Andrew Purtell and
> >> Poorna Chandra.
> >>
> >>
> >> == Alignment ==
> >>
> >> The ASF is the natural choice to host the Tephra project as its goal of
> >> encouraging community-driven open source projects fits with our vision
> >> for Tephra.
> >>
> >> Additionally, many other projects with which we are familiar and expect
> >> Tephra to integrate, such as Phoenix, ZooKeeper, HDFS, log4j, and
> >> others mentioned in the External Dependencies section, are Apache
> >> projects, and Tephra will benefit from close proximity to them.
> >>
> >> = Known Risks =
> >>
> >> == Orphaned Products ==
> >>
> >> There is very little risk of Tephra being orphaned, as it is a key part
> >> of Cask Data’s products. The core Tephra developers plan to continue to
> >> work on Tephra, and Cask Data has funding in place to support their
> >> efforts going forward.
> >> Also, with Phoenix using Tephra for transactions, Phoenix developers are
> >> keen on contributing to Tephra.
> >>
> >>
> >> == Inexperience with Open Source ==
> >>
> >> Several of the core developers have experience with open source
> >> development. Andreas Neumann is an Apache committer for Oozie and Twill.
> >> Terence Yim is an Apache committer for Helix and Twill. Poorna Chandra
> >> is an Apache committer for Twill. Gary Helmling is a committer for
> >> Apache Twill and a committer and PMC member for Apache HBase.
> >> James Taylor is PMC chair for Apache Phoenix, a PMC member of Apache
> >> Calcite, and an IPMC member.
> >>
> >> == Homogeneous Developers ==
> >>
> >> The current core developers are all Cask Data employees. However, we
> >> intend to establish a developer community that includes independent and
> >> corporate contributors. We are encouraging new contributors via our
> >> mailing lists, public presentations, and personal contacts, and we will
> >> continue to do so.
> >>
> >> Apache Phoenix developers have already contributed several patches to
> >> Tephra, and have expressed interest in becoming long-term contributors.
> >>
> >> == Reliance on Salaried Developers ==
> >>
> >> Currently, these developers are paid to work on Tephra. Once the project
> >> has built a community, we expect to attract committers, developers, and
> >> community members other than the current core developers. However,
> >> because Cask Data products use Tephra internally, the reliance on
> >> salaried developers is unlikely to change, at least in the near term.
> >>
> >> == Relationships with Other Apache Products ==
> >>
> >> Tephra is deeply integrated with Apache projects. Tephra provides
> >> transactions over Apache HBase, and uses Apache Twill and Apache
> >> ZooKeeper for coordination.
> >> A number of other Apache projects are Tephra dependencies, and are
> >> listed in the External Dependencies section.
> >>
> >> In addition, Apache Phoenix is using Tephra as the transaction engine.
> >>
> >> == An Excessive Fascination with the Apache Brand ==
> >>
> >> While we respect the reputation of the Apache brand and have no doubt
> >> that it will attract contributors and users, our interest is primarily
> >> to give Tephra a solid home as an open source project following an
> >> established development model. We have also given additional reasons in
> >> the Rationale and Alignment sections.
> >>
> >> = Documentation =
> >>
> >> The current documentation for Tephra is at
> >> https://github.com/caskdata/tephra.
> >>
> >> = Initial Source =
> >>
> >> The Tephra codebase is currently hosted at
> >> https://github.com/caskdata/tephra.
> >>
> >> = Source and Intellectual Property Submission Plan =
> >>
> >> The Tephra codebase is currently licensed under the Apache 2.0 license.
> >> Cask Data owns the trademark for "Tephra". As part of the incubation
> >> process, Cask Data will transfer the trademark to the Apache Software
> >> Foundation.
> >>
> >> = External Dependencies =
> >>
> >> The dependencies all have Apache-compatible licenses:
> >>   * dropwizard metrics (Apache 2.0)
> >>   * fastutil (Apache 2.0)
> >>   * gson (Apache 2.0)
> >>   * guava-libraries (Apache 2.0)
> >>   * guice (Apache 2.0)
> >>   * hadoop (Apache 2.0)
> >>   * hbase (Apache 2.0)
> >>   * hdfs (Apache 2.0)
> >>   * junit (EPL v1.0)
> >>   * logback (EPL v1.0)
> >>   * slf4j (MIT)
> >>   * thrift (Apache 2.0)
> >>   * twill (Apache 2.0)
> >>   * zookeeper (Apache 2.0)
> >>
> >> = Cryptography =
> >>
> >> Tephra does not use cryptography itself; however, it can run on secure
> >> Hadoop, which uses Kerberos.
> >>
> >> = Required Resources =
> >>
> >> == Mailing Lists ==
> >>
> >>   * tephra-private for private PMC discussions (with moderated
> >>     subscriptions)
> >>   * tephra-dev for technical discussions among contributors
> >>   * tephra-commits for notification about commits
> >>
> >> == Subversion Directory ==
> >>
> >> Git is the preferred source control system: git://git.apache.org/tephra
> >>
> >> == Issue Tracking ==
> >>
> >> JIRA Tephra (TEPHRA)
> >>
> >> == Other Resources ==
> >>
> >> The existing code already has unit tests, so we would like a Hudson
> >> instance to run them whenever a new patch is submitted. This can be
> >> added after project creation.
> >>
> >> = Initial Committers =
> >>
> >>   * Andreas Neumann <anew at apache dot org>
> >>   * Terence Yim <chtyim at apache dot org>
> >>   * Poorna Chandra <poorna at apache dot org>
> >>   * Gokul Gunasekaran <gokul at cask dot co>
> >>   * James Taylor <jamestaylor at apache dot org>
> >>   * Thomas D'Silva <tdsilva at apache dot org>
> >>   * Gary Helmling <garyh at apache dot org>
> >>
> >> = Affiliations =
> >>
> >>   * Andreas Neumann (Cask Data)
> >>   * Terence Yim (Cask Data)
> >>   * Poorna Chandra (Cask Data)
> >>   * Gokul Gunasekaran (Cask Data)
> >>   * James Taylor (Salesforce.com)
> >>   * Thomas D'Silva (Salesforce.com)
> >>   * Gary Helmling (Facebook)
> >>
> >> = Sponsors =
> >>
> >> == Champion ==
> >>
> >> James Taylor <jamestaylor at apache dot org> (V.P., Apache Phoenix)
> >>
> >> == Nominated Mentors ==
> >>
> >>   * James Taylor <jamestaylor at apache dot org>
> >>   * Lars Hofhansl <larsh at apache dot org>
> >>   * Andrew Purtell <apurtell at apache dot org>
> >>   * Alan Gates <gates at apache dot org>
> >>   * Henry Saputra <hsaputra at apache dot org>
> >>
> >> == Sponsoring Entity ==
> >>
> >> We are requesting that the Incubator sponsor this project.
> >>
> >>
> > --
> > Jean-Baptiste Onofré
> > jbonofre@apache.org
> > http://blog.nanthrax.net
> > Talend - http://www.talend.com
> >
>
