incubator-general mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Joe Stein <crypt...@gmail.com>
Subject Re: [PROPOSAL] Storm for Apache Incubator
Date Thu, 05 Sep 2013 01:41:16 GMT
What does this mean for storm contribs (
https://github.com/nathanmarz/storm-contrib)? (spouts & bolts) e.g The
Apache Kafka spout already it is hard to know which to use and which is
best for 0.7.X and 0.8.X-betaX...  Is the Apache Storm project going to
help corral that or is it only for Storm core as the proposal implies with
only the storm code base https://github.com/nathanmarz/storm being part of
the project?

A lot of traffic on the existing user list is about spouts (e.g. the Kafka
Spout) and I was not sure if that would still be talked about or funneled
somewhere else or what the thoughts/plans where for the parts built within
Storm that are existing now?

/*******************************************
 Joe Stein
 Founder, Principal Consultant
 Big Data Open Source Security LLC
 http://www.stealth.ly
 Twitter: @allthingshadoop <http://www.twitter.com/allthingshadoop>
********************************************/


On Wed, Sep 4, 2013 at 4:34 PM, Nathan Marz <nathan@nathanmarz.com> wrote:

> We definitely need a storm-user list as the existing google groups mailing
> list for Storm is quite active. So we'll need to transition that over. I
> agree on adding a storm-commits list and added it to the proposal.
>
>
> On Wed, Sep 4, 2013 at 11:50 AM, Henry Saputra <henry.saputra@gmail.com
> >wrote:
>
> > Excited about Storm coming to Apache. Small comment about the mailing
> list,
> > you may want to propose having:
> > * storm-dev
> > * storm-commits
> > * storm-private (with moderated subscriptions)
> >
> > instead for starting into incubator.
> >
> > However, Storm has been a well known open source project, maybe it does
> > valid to have storm-user from the beginning. But I think you may need
> > storm-commits
> > list to separate commits log from dev discussions.
> > Mentors can chime in about this.
> >
> > Thanks,
> >
> > Henry
> >
> >
> >
> > On Wed, Sep 4, 2013 at 1:07 AM, Nathan Marz <nathan@nathanmarz.com>
> wrote:
> >
> > > Hi everyone,
> > >
> > > I'd like to propose Storm to be an Apache Incubator project. After much
> > > thought I believe this is the right next step for the project, and I
> look
> > > forward to hearing everyone's thoughts and feedback!
> > >
> > > Here's a link to the proposal:
> > > https://wiki.apache.org/incubator/StormProposal
> > >
> > > The proposal is also pasted below.
> > >
> > > -Nathan
> > >
> > >
> > > = Storm Proposal =
> > >
> > > == Abstract ==
> > >
> > > Storm is a distributed, fault-tolerant, and high-performance realtime
> > > computation system that provides strong guarantees on the processing of
> > > data.
> > >
> > > == Proposal ==
> > >
> > > Storm is a distributed real-time computation system. Similar to how
> > Hadoop
> > > provides a set of general primitives for doing batch processing, Storm
> > > provides a set of general primitives for doing real-time computation.
> Its
> > > use cases span stream processing, distributed RPC, continuous
> > computation,
> > > and more. Storm has become a preferred technology for near-realtime
> > > big-data processing by many organizations worldwide (see a partial list
> > at
> > > https://github.com/nathanmarz/storm/wiki/Powered-By). As an open
> source
> > > project, Storm’s developer community has grown rapidly to 46 members.
> > >
> > > == Background ==
> > >
> > > The past decade has seen a revolution in data processing. MapReduce,
> > > Hadoop, and related technologies have made it possible to store and
> > process
> > > data at scales previously unthinkable. Unfortunately, these data
> > processing
> > > technologies are not realtime systems, nor are they meant to be. The
> lack
> > > of a "Hadoop of realtime" has become the biggest hole in the data
> > > processing ecosystem. Storm fills that hole.
> > >
> > > Storm was initially developed and deployed at BackType in 2011. After 7
> > > months of development BackType was acquired by Twitter in July 2011.
> > Storm
> > > was open sourced in September 2011.
> > >
> > > Storm has been under continuous development on its Github repository
> > since
> > > being open-sourced. It has undergone four major releases (0.5, 0.6,
> 0.7,
> > > 0.8) and many minor ones.
> > >
> > > == Rationale ==
> > >
> > > Storm is a general platform for low-latency big-data processing. It is
> > > complementary to the existing Apache projects, such as Hadoop. Many
> > > applications are actually exploring using both Hadoop and Storm for
> > > big-data processing. Bringing Storm into Apache is very beneficial to
> > both
> > > Apache community and Storm community.
> > >
> > > The rapid growth of Storm community is empowered by open source. We
> > believe
> > > the Apache foundation is a great fit as the long-term home for Storm,
> as
> > it
> > > provides an established process for community-driven development and
> > > decision making by consensus. This is exactly the model we want for
> > future
> > > Storm development.
> > >
> > > == Initial Goals ==
> > >
> > >   * Move the existing codebase to Apache
> > >   * Integrate with the Apache development process
> > >   * Ensure all dependencies are compliant with Apache License version
> 2.0
> > >   * Incremental development and releases per Apache guidelines
> > >
> > > == Current Status ==
> > >
> > > Storm has undergone four major releases (0.5, 0.6, 0.7, 0.8) and many
> > minor
> > > ones. Storm 0.9 is about to be released. Storm is being used in
> > production
> > > by over 50 organizations. Storm codebase is currently hosted at
> > github.com
> > > ,
> > > which will seed the Apache git repository.
> > >
> > > === Meritocracy ===
> > >
> > > We plan to invest in supporting a meritocracy. We will discuss the
> > > requirements in an open forum. Several companies have already expressed
> > > interest in this project, and we intend to invite additional developers
> > to
> > > participate. We will encourage and monitor community participation so
> > that
> > > privileges can be extended to those that contribute.
> > >
> > > === Community ===
> > >
> > > The need for a low-latency big-data processing platform in the open
> > source
> > > is tremendous. Storm is currently being used by at least 50
> organizations
> > > worldwide (see https://github.com/nathanmarz/storm/wiki/Powered-By),
> and
> > > is
> > > the most starred Java project on Github. By bringing Storm into Apache,
> > we
> > > believe that the community will grow even bigger.
> > >
> > > === Core Developers ===
> > >
> > > Storm was started by Nathan Marz at BackType, and now has developers
> from
> > > Yahoo!, Microsoft, Alibaba, Infochimps, and many other companies.
> > >
> > > === Alignment ===
> > >
> > > In the big-data processing ecosystem, Storm is a very popular
> low-latency
> > > platform, while Hadoop is the primary platform for batch processing. We
> > > believe that it will help the further growth of big-data community by
> > > having Hadoop and Storm aligned within Apache foundation. The alignment
> > is
> > > also beneficial to other Apache communities (such as Zookeeper, Thrift,
> > > Mesos). We could include additional sub-projects, Storm-on-YARN and
> > > Storm-on-Mesos, in the near future.
> > >
> > > == Known Risks ==
> > >
> > > === Orphaned Products ===
> > >
> > > The risk of the Storm project being abandoned is minimal. There are at
> > > least 50 organizations (Twitter, Yahoo!, Microsoft, Groupon, Baidu,
> > > Alibaba, Alipay, Taobao, PARC, RocketFuel etc) are highly incentivized
> to
> > > continue development. Many of these organizations have built critical
> > > business applications upon Storm, and have devoted significant internal
> > > infrastructure investment in Storm.
> > >
> > > === Inexperience with Open Source ===
> > >
> > > Storm has existed as a healthy open source project for several years.
> > > During that time, we have curated an open-source community
> successfully,
> > > attracting over 40 developers from a diverse group of companies
> including
> > > Twitter, Yahoo!, and Alibaba.
> > >
> > > === Homogenous Developers ===
> > >
> > > The initial committers are employed by large companies (including
> > Twitter,
> > > Yahoo!, Alibaba, Microsoft) and well-funded startups. Storm has an
> active
> > > community of developers, and we are committed to recruiting additional
> > > committers based on their contributions to the project.
> > >
> > > === Reliance on Salaried Developers ===
> > >
> > > It is expected that Storm development will occur on both salaried time
> > and
> > > on volunteer time, after hours. The majority of initial committers are
> > paid
> > > by their employer to contribute to this project. However, they are all
> > > passionate about the project, and we are confident that the project
> will
> > > continue even if no salaried developers contribute to the project. We
> are
> > > committed to recruiting additional committers including non-salaried
> > > developers.
> > >
> > > === Relationships with Other Apache Products ===
> > >
> > > As mentioned in the Alignment section, Storm is closely integrated with
> > > Hadoop,
> > > Zookeeper, Thrift, YARN and Mesos in a numerous ways. We look forward
> to
> > > collaborating with those communities, as well as other Apache
> communities
> > > (including Apache S4 which focuses on stateful low-latency processing).
> > >
> > > === An Excessive Fascination with the Apache Brand ===
> > >
> > > Storm is already a healthy and well known open source project. This
> > > proposal is not for the purpose of generating publicity. Rather, the
> > > primary benefits to joining Apache are those outlined in the Rationale
> > > section.
> > >
> > > == Documentation ==
> > >
> > > The reader will find these websites highly relevant:
> > >
> > >   * Storm website: http://storm-project.net
> > >   * Storm documentation: https://github.com/nathanmarz/storm/wiki
> > >   * Codebase: https://github.com/nathanmarz/storm
> > >   * User group: https://groups.google.com/group/storm-user
> > >
> > > == Source and Intellectual Property Submission Plan ==
> > >
> > > The Storm codebase is currently hosted on Github:
> > > https://github.com/nathanmarz/storm.
> > >
> > > This is the exact codebase that we would migrate to the Apache
> > foundation.
> > >
> > > The Storm source code is currently licensed under Eclipse Public
> License
> > > Version 1.0. Some source code was contributed under a contributor
> > agreement
> > > based on the Sun contributor agreement (v1.5). More recent code has
> been
> > > contributed under an Apache style agreement (see
> > >
> https://dl.dropboxusercontent.com/u/133901206/storm-apache-style-cla.txt
> > ).
> > >
> > > Upon entering Apache, Storm will migrate to an Apache License 2.0 with
> > all
> > > contributions licensed to the Apache Foundation. In certain cases where
> > > individuals or organizations hold copyright, we will ensure they grant
> a
> > > license to the Apache Foundation. Going forward, all commits will be
> > > licensed directly to the Apache foundation through our signed
> Individual
> > > Contributor License Agreements for all committers on the project.
> > >
> > > Yahoo! is also willing to move Storm-on-YARN code from github to be a
> > > subproject of Apache Storm project. Storm-on-YARN is currently licensed
> > > under Apache License 2.0 and receive contribution under Apache style
> CLA.
> > > Upon entering Apache, Yahoo! will sign over copyright to Apache
> > foundation.
> > >
> > > == External Dependencies ==
> > >
> > > To the best of our knowledge, all of Storm dependencies (except
> 0MQ/JMQ)
> > > are distributed under Apache compatible licenses. Upon acceptance to
> the
> > > incubator, we would begin a thorough analysis of all transitive
> > > dependencies to verify this fact and introduce license checking into
> the
> > > build and release process (for instance integrating Apache Rat).
> > >
> > > Storm has used 0MQ and JMQ as the default mechanism for internal
> > messaging
> > > layer, and 0MQ/JMQ is licensed under GNU Lesser General Public License.
> > > Recently, we have made Storm messaging layer pluggable, and plan to use
> > > Netty (which is licensed under Apache License v2) as our default
> > messaging
> > > plugin (while keep 0MQ as an optional plugin).
> > >
> > > == Cryptography ==
> > >
> > > We do not expect Storm to be a controlled export item due to the use of
> > > encryption.
> > >
> > > Storm enable encryptions via 2 plugins:
> > >
> > >   * SASL authentication plugins … Currently, we have provide “no-op”
> > > authentication and digest authentication. In near future, we will
> > introduce
> > > Kerberos authentication.
> > >   * Tuple payload serialization plugins … Storm provides plugins for
> > > plain-object serialization and blowfish encryption.
> > >
> > > == Required Resources ==
> > >
> > > === Mailing lists ===
> > >
> > > * storm-user
> > > * storm-dev
> > > * storm-private (with moderated subscriptions)
> > >
> > > === Subversion Directory ===
> > >
> > > Git is the preferred source control system: git://git.apache.org/storm
> > >
> > > === Issue Tracking ===
> > >
> > > JIRA Storm (STORM)
> > >
> > > == Initial Committers ==
> > >
> > >   * Nathan Marz <nathan at nathanmarz dot com>
> > >   * James Xu <xumingmingv at gmail dot com>
> > >   * Jason Jackson <jason at cvk dot ca>
> > >   * Andy Feng <afeng at yahoo-inc dot com>
> > >   * Flip Kromer  <flip at infochimps dot com>
> > >   * David Lao <davidlao at microsoft dot com>
> > >   * P. Taylor Goetz <ptgoetz at gmail dot com>
> > >
> > > == Affiliations ==
> > >
> > >   * Nathan Marz - Nathan’s Startup
> > >   * James Xu - Alibaba
> > >   * Jason Jackson - Twitter
> > >   * Andy Feng - Yahoo!
> > >   * Flip Kromer - Infochimps
> > >   * David Lao - Microsoft
> > >   * P. Taylor Goetz - Health Market Science
> > >
> > > == Sponsors ==
> > >
> > > === Champion ===
> > >
> > >   * Doug Cutting  <cutting at apache dot org>
> > >
> > > === Nominated Mentors ===
> > >
> > >  * Ted Dunning <tdunning at maprtech.com>
> > >  * Arvind Prabhaker <arvind at apache dot org>
> > >  * Devaraj Das <ddas at hortonworks dot com>
> > >
> > > === Sponsoring Entity ===
> > >
> > > The Apache Incubator
> > >
> >
>
>
>
> --
> Twitter: @nathanmarz
> http://nathanmarz.com
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message