incubator-general mailing list archives

From Suresh Srinivas <sur...@hortonworks.com>
Subject Re: [VOTE] Accept Storm into the Incubator
Date Fri, 13 Sep 2013 21:59:19 GMT
+1 (non-binding)


On Thu, Sep 12, 2013 at 12:19 PM, Doug Cutting <cutting@apache.org> wrote:

> Discussion about the Storm proposal has subsided, issues raised now
> seemingly resolved.
>
> I'd like to call a vote to accept Storm as a new Incubator podling.
>
> The proposal is included below and is also at:
>
>   https://wiki.apache.org/incubator/StormProposal
>
> Let's keep the vote open for four working days, until 18 September.
>
> [ ] +1 Accept Storm into the Incubator
> [ ] +0 Don't care.
> [ ] -1 Don't accept Storm because...
>
> Doug
>
>
> = Storm Proposal =
>
> == Abstract ==
>
> Storm is a distributed, fault-tolerant, and high-performance realtime
> computation system that provides strong guarantees on the processing
> of data.
>
> == Proposal ==
>
> Storm is a distributed real-time computation system. Similar to how
> Hadoop provides a set of general primitives for doing batch
> processing, Storm provides a set of general primitives for doing
> real-time computation. Its use cases span stream processing,
> distributed RPC, continuous computation, and more. Storm has become a
> preferred technology for near-realtime big-data processing by many
> organizations worldwide (see a partial list at
> https://github.com/nathanmarz/storm/wiki/Powered-By). As an open
> source project, Storm’s developer community has grown rapidly to 46
> members.
>
> == Background ==
>
> The past decade has seen a revolution in data processing. MapReduce,
> Hadoop, and related technologies have made it possible to store and
> process data at scales previously unthinkable. Unfortunately, these
> data processing technologies are not realtime systems, nor are they
> meant to be. The lack of a "Hadoop of realtime" has become the biggest
> hole in the data processing ecosystem. Storm fills that hole.
>
> Storm was initially developed and deployed at BackType in 2011. After
> 7 months of development BackType was acquired by Twitter in July 2011.
> Storm was open sourced in September 2011.
>
> Storm has been under continuous development on its Github repository
> since being open-sourced. It has undergone four major releases (0.5,
> 0.6, 0.7, 0.8) and many minor ones.
>
>
> == Rationale ==
>
> Storm is a general platform for low-latency big-data processing. It is
> complementary to the existing Apache projects, such as Hadoop. Many
> applications are actually exploring using both Hadoop and Storm for
> big-data processing. Bringing Storm into Apache would benefit both the
> Apache community and the Storm community.
>
> The rapid growth of the Storm community has been driven by open source. We
> believe the Apache foundation is a great fit as the long-term home for
> Storm, as it provides an established process for community-driven
> development and decision making by consensus. This is exactly the
> model we want for future Storm development.
>
> == Initial Goals ==
>
>    * Move the existing codebase to Apache
>    * Integrate with the Apache development process
>    * Ensure all dependencies are compliant with Apache License version 2.0
>    * Incremental development and releases per Apache guidelines
>
> == Current Status ==
>
> Storm has undergone four major releases (0.5, 0.6, 0.7, 0.8) and many
> minor ones. Storm 0.9 is about to be released. Storm is being used in
> production by over 50 organizations. The Storm codebase is currently
> hosted at github.com, which will seed the Apache git repository.
>
> === Meritocracy ===
>
> We plan to invest in supporting a meritocracy. We will discuss the
> requirements in an open forum. Several companies have already
> expressed interest in this project, and we intend to invite additional
> developers to participate. We will encourage and monitor community
> participation so that privileges can be extended to those that
> contribute.
>
> === Community ===
>
> There is a tremendous need for an open source, low-latency big-data
> processing platform. Storm is currently being used by at least 50
> organizations worldwide (see
> https://github.com/nathanmarz/storm/wiki/Powered-By), and is the most
> starred Java project on Github. By bringing Storm into Apache, we
> believe that the community will grow even bigger.
>
> === Core Developers ===
>
> Storm was started by Nathan Marz at BackType, and now has developers
> from Yahoo!, Microsoft, Alibaba, Infochimps, and many other companies.
>
> === Alignment ===
>
> In the big-data processing ecosystem, Storm is a very popular
> low-latency platform, while Hadoop is the primary platform for batch
> processing. We believe that aligning Hadoop and Storm within the
> Apache foundation will help the big-data community grow further.
> The alignment is also beneficial to other Apache
> communities (such as Zookeeper, Thrift, Mesos). We could include
> additional sub-projects, Storm-on-YARN and Storm-on-Mesos, in the near
> future.
>
> == Known Risks ==
>
> === Orphaned Products ===
>
> The risk of the Storm project being abandoned is minimal. At least 50
> organizations (Twitter, Yahoo!, Microsoft, Groupon, Baidu, Alibaba,
> Alipay, Taobao, PARC, RocketFuel, etc.) are highly incentivized
> to continue development. Many of these organizations have built
> critical business applications upon Storm, and have made
> significant internal infrastructure investments in Storm.
>
> === Inexperience with Open Source ===
>
> Storm has existed as a healthy open source project for several years.
> During that time, we have curated an open-source community
> successfully, attracting over 40 developers from a diverse group of
> companies including Twitter, Yahoo!, and Alibaba.
>
> === Homogenous Developers ===
>
> The initial committers are employed by large companies (including
> Twitter, Yahoo!, Alibaba, Microsoft) and well-funded startups. Storm
> has an active community of developers, and we are committed to
> recruiting additional committers based on their contributions to the
> project.
>
> === Reliance on Salaried Developers ===
>
> It is expected that Storm development will occur on both salaried time
> and on volunteer time, after hours. The majority of initial committers
> are paid by their employer to contribute to this project. However,
> they are all passionate about the project, and we are confident that
> the project will continue even if no salaried developers contribute to
> the project. We are committed to recruiting additional committers
> including non-salaried developers.
>
> === Relationships with Other Apache Products ===
>
> As mentioned in the Alignment section, Storm is closely integrated with
> Hadoop, Zookeeper, Thrift, YARN and Mesos in numerous ways. We look forward
> to collaborating with those communities, as well as other Apache
> communities (including Apache S4 which focuses on stateful low-latency
> processing).
>
> === An Excessive Fascination with the Apache Brand ===
>
> Storm is already a healthy and well known open source project. This
> proposal is not for the purpose of generating publicity. Rather, the
> primary benefits to joining Apache are those outlined in the Rationale
> section.
>
> == Documentation ==
>
> The reader will find these websites highly relevant:
>    * Storm website: http://storm-project.net
>    * Storm documentation: https://github.com/nathanmarz/storm/wiki
>    * Codebase: https://github.com/nathanmarz/storm
>    * User group: https://groups.google.com/group/storm-user
>
> == Source and Intellectual Property Submission Plan ==
>
> The Storm codebase is currently hosted on Github:
> https://github.com/nathanmarz/storm.
> This is the exact codebase that we would migrate to the Apache foundation.
>
> The Storm source code is currently licensed under Eclipse Public
> License Version 1.0. Some source code was contributed under a
> contributor agreement based on the Sun contributor agreement (v1.5).
> More recent code has been contributed under an Apache style agreement
> (see
> https://dl.dropboxusercontent.com/u/133901206/storm-apache-style-cla.txt).
>
> Upon entering Apache, Storm will migrate to an Apache License 2.0 with
> all contributions licensed to the Apache Foundation. In certain cases
> where individuals or organizations hold copyright, we will ensure they
> grant a license to the Apache Foundation. Going forward, all commits
> will be licensed directly to the Apache foundation through our signed
> Individual Contributor License Agreements for all committers on the
> project.
>
> storm-kafka, which lets one use Kafka as a source for Storm, will also
> be submitted under the contrib folder for the Apache Storm project.
>
> Yahoo! is also willing to move the Storm-on-YARN code from github to be a
> subproject of the Apache Storm project. Storm-on-YARN is currently
> licensed under Apache License 2.0 and receives contributions under an
> Apache-style CLA. Upon entering Apache, Yahoo! will sign over
> copyright to the Apache foundation.
>
> == External Dependencies ==
>
> To the best of our knowledge, all of Storm's dependencies (except
> 0MQ/JMQ) are distributed under Apache compatible licenses. Upon
> acceptance to the incubator, we would begin a thorough analysis of all
> transitive dependencies to verify this fact and introduce license
> checking into the build and release process (for instance integrating
> Apache Rat).
>
> Storm has used 0MQ and JMQ as the default mechanism for its internal
> messaging layer, and 0MQ/JMQ is licensed under the GNU Lesser General
> Public License. Recently, we have made Storm's messaging layer
> pluggable, and plan to use Netty (which is licensed under Apache
> License v2) as our default messaging plugin (while keeping 0MQ as an
> optional plugin).
>
> == Cryptography ==
>
> We do not expect Storm to be a controlled export item due to the use
> of encryption.
> Storm enables encryption via two plugins:
>    * SASL authentication plugins … Currently, we provide “no-op”
> authentication and digest authentication. In the near future, we will
> introduce Kerberos authentication.
>    * Tuple payload serialization plugins … Storm provides plugins for
> plain-object serialization and Blowfish encryption.
>
> == Required Resources ==
>
> === Mailing lists ===
>
>  * storm-user
>  * storm-dev
>  * storm-commits
>  * storm-private (with moderated subscriptions)
>
> === Subversion Directory ===
>
> Git is the preferred source control system: git://git.apache.org/storm
>
>
> === Issue Tracking ===
>
> JIRA Storm (STORM)
>
> == Initial Committers ==
>
>    * Nathan Marz <nathan at nathanmarz dot com>
>    * James Xu <xumingmingv at gmail dot com>
>    * Jason Jackson <jason at cvk dot ca>
>    * Andy Feng <afeng at yahoo-inc dot com>
>    * Flip Kromer  <flip at infochimps dot com>
>    * David Lao <davidlao at microsoft dot com>
>    * P. Taylor Goetz <ptgoetz at gmail dot com>
>
> == Affiliations ==
>
>    * Nathan Marz - Nathan’s Startup
>    * James Xu - Alibaba
>    * Jason Jackson - Twitter
>    * Andy Feng - Yahoo!
>    * Flip Kromer - Infochimps
>    * David Lao - Microsoft
>    * P. Taylor Goetz - Health Market Science
>
> == Sponsors ==
>
>
> === Champion ===
>
>    * Doug Cutting  <cutting at apache dot org>
>
> === Nominated Mentors ===
>
>   * Ted Dunning <tdunning at maprtech dot com>
>   * Arvind Prabhakar <arvind at apache dot org>
>   * Devaraj Das <ddas at hortonworks dot com>
>   * Matt Franklin <m.ben.franklin at gmail dot com>
>   * Benjamin Hindman <benjamin.hindman at gmail dot com>
>
> === Sponsoring Entity ===
>
>  The Apache Incubator
>
>
>


-- 
http://hortonworks.com/download/

