Subject: Re: [PROPOSAL] Storm for Apache Incubator
From: Nathan Marz
Date: Wed, 4 Sep 2013 09:39:47 -0700
To: "general@incubator.apache.org"

That's how many people have contributed code (even just one small patch). The people on the committer list have all made significant and high quality contributions.

On Sep 4, 2013, at 3:28 AM, Reto Bachmann-Gmür wrote:

> +1 (unbinding)
> Looking very good. Just wondering why there are only 7 initial committers
> when you say that the storm developer community has 46 members.
>
> Cheers.
> Reto
>
>
> On Wed, Sep 4, 2013 at 11:44 AM, Srinath Perera wrote:
>
>> +1, look good.
>>
>> --Srinath
>>
>>
>> On Wed, Sep 4, 2013 at 1:37 PM, Nathan Marz wrote:
>>
>>> Hi everyone,
>>>
>>> I'd like to propose Storm to be an Apache Incubator project. After much
>>> thought I believe this is the right next step for the project, and I look
>>> forward to hearing everyone's thoughts and feedback!
>>>
>>> Here's a link to the proposal:
>>> https://wiki.apache.org/incubator/StormProposal
>>>
>>> The proposal is also pasted below.
>>>
>>> -Nathan
>>>
>>>
>>> = Storm Proposal =
>>>
>>> == Abstract ==
>>>
>>> Storm is a distributed, fault-tolerant, and high-performance realtime
>>> computation system that provides strong guarantees on the processing of
>>> data.
>>>
>>> == Proposal ==
>>>
>>> Storm is a distributed real-time computation system. Similar to how Hadoop
>>> provides a set of general primitives for doing batch processing, Storm
>>> provides a set of general primitives for doing real-time computation. Its
>>> use cases span stream processing, distributed RPC, continuous computation,
>>> and more. Storm has become a preferred technology for near-realtime
>>> big-data processing by many organizations worldwide (see a partial list at
>>> https://github.com/nathanmarz/storm/wiki/Powered-By). As an open source
>>> project, Storm’s developer community has grown rapidly to 46 members.
>>>
>>> == Background ==
>>>
>>> The past decade has seen a revolution in data processing. MapReduce,
>>> Hadoop, and related technologies have made it possible to store and process
>>> data at scales previously unthinkable. Unfortunately, these data processing
>>> technologies are not realtime systems, nor are they meant to be. The lack
>>> of a "Hadoop of realtime" has become the biggest hole in the data
>>> processing ecosystem. Storm fills that hole.
>>>
>>> Storm was initially developed and deployed at BackType in 2011. After 7
>>> months of development BackType was acquired by Twitter in July 2011. Storm
>>> was open sourced in September 2011.
>>>
>>> Storm has been under continuous development on its Github repository since
>>> being open-sourced. It has undergone four major releases (0.5, 0.6, 0.7,
>>> 0.8) and many minor ones.
>>>
>>> == Rationale ==
>>>
>>> Storm is a general platform for low-latency big-data processing.
>>> It is complementary to existing Apache projects, such as Hadoop. Many
>>> applications are actually exploring using both Hadoop and Storm for
>>> big-data processing. Bringing Storm into Apache is very beneficial to both
>>> the Apache community and the Storm community.
>>>
>>> The rapid growth of the Storm community is empowered by open source. We
>>> believe the Apache foundation is a great fit as the long-term home for
>>> Storm, as it provides an established process for community-driven
>>> development and decision making by consensus. This is exactly the model we
>>> want for future Storm development.
>>>
>>> == Initial Goals ==
>>>
>>> * Move the existing codebase to Apache
>>> * Integrate with the Apache development process
>>> * Ensure all dependencies are compliant with Apache License version 2.0
>>> * Incremental development and releases per Apache guidelines
>>>
>>> == Current Status ==
>>>
>>> Storm has undergone four major releases (0.5, 0.6, 0.7, 0.8) and many minor
>>> ones. Storm 0.9 is about to be released. Storm is being used in production
>>> by over 50 organizations. The Storm codebase is currently hosted at
>>> github.com, which will seed the Apache git repository.
>>>
>>> === Meritocracy ===
>>>
>>> We plan to invest in supporting a meritocracy. We will discuss the
>>> requirements in an open forum. Several companies have already expressed
>>> interest in this project, and we intend to invite additional developers to
>>> participate. We will encourage and monitor community participation so that
>>> privileges can be extended to those that contribute.
>>>
>>> === Community ===
>>>
>>> The need for a low-latency big-data processing platform in open source
>>> is tremendous.
>>> Storm is currently being used by at least 50 organizations
>>> worldwide (see https://github.com/nathanmarz/storm/wiki/Powered-By), and is
>>> the most starred Java project on Github. By bringing Storm into Apache, we
>>> believe that the community will grow even bigger.
>>>
>>> === Core Developers ===
>>>
>>> Storm was started by Nathan Marz at BackType, and now has developers from
>>> Yahoo!, Microsoft, Alibaba, Infochimps, and many other companies.
>>>
>>> === Alignment ===
>>>
>>> In the big-data processing ecosystem, Storm is a very popular low-latency
>>> platform, while Hadoop is the primary platform for batch processing. We
>>> believe that having Hadoop and Storm aligned within the Apache foundation
>>> will help the further growth of the big-data community. The alignment is
>>> also beneficial to other Apache communities (such as Zookeeper, Thrift,
>>> Mesos). We could include additional sub-projects, Storm-on-YARN and
>>> Storm-on-Mesos, in the near future.
>>>
>>> == Known Risks ==
>>>
>>> === Orphaned Products ===
>>>
>>> The risk of the Storm project being abandoned is minimal. At least 50
>>> organizations (Twitter, Yahoo!, Microsoft, Groupon, Baidu, Alibaba, Alipay,
>>> Taobao, PARC, RocketFuel etc) are highly incentivized to continue
>>> development. Many of these organizations have built critical business
>>> applications upon Storm, and have made significant internal infrastructure
>>> investments in Storm.
>>>
>>> === Inexperience with Open Source ===
>>>
>>> Storm has existed as a healthy open source project for several years.
>>> During that time, we have curated an open-source community successfully,
>>> attracting over 40 developers from a diverse group of companies including
>>> Twitter, Yahoo!, and Alibaba.
>>>
>>> === Homogenous Developers ===
>>>
>>> The initial committers are employed by large companies (including Twitter,
>>> Yahoo!, Alibaba, Microsoft) and well-funded startups. Storm has an active
>>> community of developers, and we are committed to recruiting additional
>>> committers based on their contributions to the project.
>>>
>>> === Reliance on Salaried Developers ===
>>>
>>> It is expected that Storm development will occur on both salaried time and
>>> on volunteer time, after hours. The majority of initial committers are paid
>>> by their employer to contribute to this project. However, they are all
>>> passionate about the project, and we are confident that the project will
>>> continue even if no salaried developers contribute to the project. We are
>>> committed to recruiting additional committers including non-salaried
>>> developers.
>>>
>>> === Relationships with Other Apache Products ===
>>>
>>> As mentioned in the Alignment section, Storm is closely integrated with
>>> Hadoop, Zookeeper, Thrift, YARN and Mesos in numerous ways. We look forward
>>> to collaborating with those communities, as well as other Apache communities
>>> (including Apache S4 which focuses on stateful low-latency processing).
>>>
>>> === An Excessive Fascination with the Apache Brand ===
>>>
>>> Storm is already a healthy and well known open source project. This
>>> proposal is not for the purpose of generating publicity. Rather, the
>>> primary benefits to joining Apache are those outlined in the Rationale
>>> section.
>>>
>>> == Documentation ==
>>>
>>> The reader will find these websites highly relevant:
>>>
>>> * Storm website: http://storm-project.net
>>> * Storm documentation: https://github.com/nathanmarz/storm/wiki
>>> * Codebase: https://github.com/nathanmarz/storm
>>> * User group: https://groups.google.com/group/storm-user
>>>
>>> == Source and Intellectual Property Submission Plan ==
>>>
>>> The Storm codebase is currently hosted on Github:
>>> https://github.com/nathanmarz/storm.
>>>
>>> This is the exact codebase that we would migrate to the Apache foundation.
>>>
>>> The Storm source code is currently licensed under Eclipse Public License
>>> Version 1.0. Some source code was contributed under a contributor agreement
>>> based on the Sun contributor agreement (v1.5). More recent code has been
>>> contributed under an Apache style agreement (see
>>> https://dl.dropboxusercontent.com/u/133901206/storm-apache-style-cla.txt).
>>>
>>> Upon entering Apache, Storm will migrate to the Apache License 2.0 with all
>>> contributions licensed to the Apache Foundation. In certain cases where
>>> individuals or organizations hold copyright, we will ensure they grant a
>>> license to the Apache Foundation. Going forward, all commits will be
>>> licensed directly to the Apache foundation through our signed Individual
>>> Contributor License Agreements for all committers on the project.
>>>
>>> Yahoo! is also willing to move the Storm-on-YARN code from github to be a
>>> subproject of the Apache Storm project. Storm-on-YARN is currently licensed
>>> under Apache License 2.0 and receives contributions under an Apache style
>>> CLA. Upon entering Apache, Yahoo! will sign over copyright to the Apache
>>> foundation.
>>>
>>> == External Dependencies ==
>>>
>>> To the best of our knowledge, all of Storm's dependencies (except 0MQ/JMQ)
>>> are distributed under Apache compatible licenses.
>>> Upon acceptance to the
>>> incubator, we would begin a thorough analysis of all transitive
>>> dependencies to verify this fact and introduce license checking into the
>>> build and release process (for instance integrating Apache Rat).
>>>
>>> Storm has used 0MQ and JMQ as the default mechanism for its internal
>>> messaging layer, and 0MQ/JMQ is licensed under the GNU Lesser General Public
>>> License. Recently, we have made the Storm messaging layer pluggable, and
>>> plan to use Netty (which is licensed under Apache License v2) as our default
>>> messaging plugin (while keeping 0MQ as an optional plugin).
>>>
>>> == Cryptography ==
>>>
>>> We do not expect Storm to be a controlled export item due to the use of
>>> encryption.
>>>
>>> Storm enables encryption via 2 plugins:
>>>
>>> * SASL authentication plugins … Currently, we provide “no-op”
>>> authentication and digest authentication. In the near future, we will
>>> introduce Kerberos authentication.
>>> * Tuple payload serialization plugins … Storm provides plugins for
>>> plain-object serialization and Blowfish encryption.
>>>
>>> == Required Resources ==
>>>
>>> === Mailing lists ===
>>>
>>> * storm-user
>>> * storm-dev
>>> * storm-private (with moderated subscriptions)
>>>
>>> === Subversion Directory ===
>>>
>>> Git is the preferred source control system: git://git.apache.org/storm
>>>
>>> === Issue Tracking ===
>>>
>>> JIRA Storm (STORM)
>>>
>>> == Initial Committers ==
>>>
>>> * Nathan Marz
>>> * James Xu
>>> * Jason Jackson
>>> * Andy Feng
>>> * Flip Kromer
>>> * David Lao
>>> * P. Taylor Goetz
>>>
>>> == Affiliations ==
>>>
>>> * Nathan Marz - Nathan’s Startup
>>> * James Xu - Alibaba
>>> * Jason Jackson - Twitter
>>> * Andy Feng - Yahoo!
>>> * Flip Kromer - Infochimps
>>> * David Lao - Microsoft
>>> * P. Taylor Goetz - Health Market Science
>>>
>>> == Sponsors ==
>>>
>>> === Champion ===
>>>
>>> * Doug Cutting
>>>
>>> === Nominated Mentors ===
>>>
>>> * Ted Dunning
>>> * Arvind Prabhaker
>>> * Devaraj Das
>>>
>>> === Sponsoring Entity ===
>>>
>>> The Apache Incubator
>>
>>
>>
>> --
>> ============================
>> Srinath Perera, Ph.D.
>> Director, Research, WSO2 Inc.
>> Visiting Faculty, University of Moratuwa
>> Member, Apache Software Foundation
>> Research Scientist, Lanka Software Foundation
>> Blog: http://srinathsview.blogspot.com/
>> Photos: http://www.flickr.com/photos/hemapani/
>> Phone: 0772360902
>>