From: "John D. Ament"
Date: Fri, 23 Jun 2017 12:08:51 +0000
Subject: Re: [VOTE] Heron to enter Apache Incubator
To: general@incubator.apache.org, billgraham@gmail.com, Von Gosling

Bill,

Would I be correct in understanding that Heron implements the same
protocol as Storm, but the actual implementation is different?

John

On Fri, Jun 23, 2017 at 1:36 AM Bill Graham wrote:

> It's grossly inaccurate to refer to Heron as a Storm fork. There are
> about 132k lines of code in the Heron codebase (plus 166k of codegen),
> of which about 7k are to implement the Apache Storm API bindings to the
> Heron API.
>
> The Rationale section of the proposal discusses the Heron architecture,
> which is a complete rewrite with little in common with Storm. The only
> overlap is that Heron supports the Storm user API for ease of migration.
>
> The value of having multiple projects to solve a common need is that each
> can foster innovation, collaboration and exchange of ideas in different
> ways. This is not a new concept to Apache. You can look at the incubator
> discussions around Accumulo vs HBase (two implementations of the BigTable
> paper), for example, to see how two different approaches to a shared
> problem can be a good thing.
>
> thanks,
> Bill
>
> On Thu, Jun 22, 2017 at 6:45 PM, Von Gosling wrote:
>
> > Hi,
> >
> > I will give +1 (non-binding), but,
> >
> > I have a similar question: with so many streaming frameworks at
> > Apache, how will each of them develop its own community?
> >
> > Best Regards,
> > Von Gosling
> >
> > On 23 June 2017, at 08:51, Edward Capriolo wrote:
> >
> > I believe Heron and Storm should be merged back together. I do not see
> > the value of Storm and a Storm fork in the ASF.
> >
> > On Thursday, June 22, 2017, Bill Graham wrote:
> >
> > Thanks Taylor for relaying these sentiments, especially the part about
> > the Heron website, which is indeed poorly worded (I suspect this could
> > have been the result of internal docs being open-sourced). I've opened
> > this pull request to update the language regarding Storm:
> >
> > https://github.com/twitter/heron/pull/1979
> >
> > On Thu, Jun 22, 2017 at 12:21 PM, P. Taylor Goetz wrote:
> >
> > The Apache Storm PMC had a discussion regarding the Heron proposal. In
> > the spirit of openness I wanted to bring some of the sentiments
> > expressed in that discussion back to this list. Please note that I am
> > paraphrasing from that discussion and attempting to relay opinions of
> > the collective PMC, not necessarily those of any individual.
> >
> > * There is a general disappointment that the Heron community chose not to
> > engage with the Storm community and instead chose a separate path.
> > * A majority of the PMC supports Heron’s incubation, though some felt it
> > would result in unnecessary duplication of effort.
> > * A majority of the PMC supports the two projects working closely
> > together. A number of PMC members suggested the two projects merge in
> > some way.
> > * Many PMC members took issue with some of the marketing language on the
> > Heron website, particularly Heron being billed as “the direct successor
> > to Apache Storm” and the prominent “Upgrade from Storm” links.
> > The main concern here was that such phrasing has a somewhat hostile
> > tone and undermines the desire for better collaboration, as well as
> > confusing users.
> >
> > One of my goals as a proposed mentor for Heron and a Storm PMC member is
> > to address some of these concerns and encourage collaboration. As I
> > mentioned to the Storm PMC on that thread, if there are ongoing concerns
> > from either the Storm PMC or the Heron PPMC about me acting as a mentor,
> > I would be willing to step down.
> >
> > +1 (binding)
> >
> > -Taylor
> >
> > On Jun 16, 2017, at 4:41 PM, Bill Graham wrote:
> >
> > Hi,
> >
> > Based on the discussion on the incubator mailing list[1] I would like
> > to call a vote to add Heron to the Apache Incubator.
> >
> > The full proposal is available below, and is also available on the
> > Apache Incubator wiki at:
> > https://wiki.apache.org/incubator/HeronProposal
> >
> > Please vote:
> > [ ] +1, bring Heron into Incubator
> > [ ] -1, do not bring Heron into Incubator, because...
> >
> > The vote will be open for 7 days, until Friday, June 23 at 14:00 PT.
> >
> > Thank you
> >
> > 1 -
> > https://lists.apache.org/thread.html/fb91f527ef479bb5df45bf2c9d93b7786c3fa6cdbfeba3128599df79@%3Cgeneral.incubator.apache.org%3E
> >
> > = Heron Proposal =
> >
> > = Abstract =
> >
> > Heron is a real-time, distributed, fault-tolerant stream processing
> > engine initially developed by Twitter.
> >
> > = Proposal =
> >
> > Heron is a real-time stream processing engine built for high
> > performance, ease of manageability, performance predictability and
> > developer productivity[1]. We wish to develop a community around Heron
> > to increase contributions and see Heron thrive in an open forum.
> >
> > = Background =
> >
> > Heron provides the ability for developers to compose directed acyclic
> > graphs (DAGs) of real-time query execution logic (i.e. a topology) and
> > submit the topology to execute on a pluggable job scheduling system
> > (e.g., Apache Aurora, YARN, Marathon, etc). Users can employ either the
> > native Heron API or the Apache Storm API to develop the topology. Heron
> > supports the Storm API for ease of migration, but beyond that Heron’s
> > architecture differs considerably from Storm’s.
> >
> > Users submit a topology to the scheduler using the Heron client, which
> > uses the Heron binary libraries to deploy all daemons required to run
> > and manage the topology. The topology therefore has no reliance on
> > centrally managed Heron services, only on a generic job scheduling
> > system, which lends itself well to running on top of Apache
> > Aurora/Mesos or Apache Hadoop/YARN (among others).
> >
> > The scheduler runs each topology as a job consisting of multiple
> > containers. One of the containers runs the topology master, responsible
> > for managing the topology. The remaining containers each run a stream
> > manager responsible for data routing, a metrics manager that collects
> > and reports various metrics, and a number of processes called Heron
> > instances, which run the user-defined logic on the stream of tuples.
> > Parallelism is achieved via process-based isolation of Heron instances,
> > which provides predictable performance while simplifying debugging. The
> > containers are allocated and managed by the scheduler framework based
> > on resource availability of nodes in the cluster.
> > The metadata for the topology, such as the physical plan and execution
> > details, is stored in the pluggable Heron State Manager (e.g. Apache
> > ZooKeeper).
> >
> > = Rationale =
> >
> > Heron is a general-purpose, modular and extensible platform that can be
> > leveraged to support common, real-time analytics use cases. There is an
> > increasing demand for open-source, scalable real-time analytics
> > systems. We believe that Heron can be leveraged by other organizations
> > to build streaming applications that can benefit from its robustness,
> > high performance, adaptability to cloud environments and ease of use.
> > Moreover, we hope that open-sourcing Heron will help to further evolve
> > the technology as the project attracts contributors with diverse
> > backgrounds and areas of expertise.
> >
> > We believe the Apache Software Foundation is a great fit as the
> > long-term home for Heron, as it provides an established process for
> > community-driven development and decision making by consensus. This is
> > exactly the model we want for future Heron development.
> >
> > = Initial Goals =
> >
> > * Move the existing codebase, website, documentation, and mailing lists
> > to Apache-hosted infrastructure.
> > * Integrate with the Apache development process.
> > * Ensure all dependencies are compliant with Apache License version 2.0.
> > * Incrementally develop and release per Apache guidelines.
> >
> > = Current Status =
> >
> > Heron is a stable project used in production at Twitter since 2014 and
> > open sourced under the ASL v2 license in 2016. The Heron source code is
> > currently hosted at github.com (https://github.com/twitter/heron),
> > which will seed the Apache git repository.
> >
> > = Meritocracy =
> >
> > By submitting this incubator proposal, we’re expressing our intent to
> > build a diverse developer community around Heron that will conduct
> > itself according to The Apache Way and use a meritocratic means of
> > building its committer base. Several companies and universities have
> > already expressed interest in and contributed to Heron. Our goal is to
> > grow the Heron community by encouraging open communication,
> > contribution and participation of all types, and ensuring that
> > contributors are recognized appropriately.
> >
> > = Community =
> >
> > Heron is currently being used by Twitter, Google, Machine Zone and
> > ndustrial.io and has received significant contributions from Microsoft
> > and Streamlio. By bringing Heron into the Apache ecosystem, we believe
> > we can attract even more developers who are interested in creating
> > real-time systems to build the project's contributor base.
> >
> > == Core Developers ==
> >
> > Current core developers are engineers from Twitter, Google, Microsoft
> > and Streamlio.
> >
> > == Alignment ==
> >
> > Heron utilizes a number of Apache technologies. Heron leverages Apache
> > ZooKeeper for coordination and has scheduler implementations to
> > integrate with Apache Mesos, Apache Aurora and Apache Hadoop's YARN
> > (via Apache REEF) as well as spout implementations to integrate with
> > Apache Kafka and metrics implementations to integrate with Scribe.
> > Heron also implements the Apache Storm user-level API, which allows
> > topologies written against Storm to run in Heron. We believe that
> > having Heron at Apache will help further the growth of the streaming
> > compute community, as well as encourage cooperation and developer
> > cross-pollination with other Apache projects.
> >
> > = Known Risks =
> >
> > == Orphaned Products ==
> >
> > The risk of the Heron project being abandoned is minimal. It is used in
> > production at Twitter and Google and other companies are evaluating or
> > adopting it for production use.
> >
> > == Inexperience with Open Source ==
> >
> > All of the core contributors to the project have considerable
> > experience with open source software development. Bill Graham[2],
> > Ashvin Agrawal[3] and Supun Kamburugamuve[4], committers on the
> > project, are PMC members of other Apache projects, and Bill and Ashvin
> > have gone through the Apache incubator process. Twitter has already
> > donated numerous projects to the ASF (e.g., Apache Mesos, Apache
> > Aurora, Apache Parquet). We also plan to be mentored by experienced ASF
> > members who can help with any roadblocks.
> >
> > == Homogenous Developers ==
> >
> > Initial committers come from 5 separate organizations. Our intention is
> > to increase the diversity of contributing developers and their
> > affiliations. To date GitHub contributions have come from approximately
> > 50 contributors from outside the Twitter team.
> >
> > == Reliance on Salaried Developers ==
> >
> > It is expected that Heron development will occur on both salaried time
> > and on volunteer time. The majority of initial committers are paid by
> > their employers to contribute to this project. We are committed to
> > recruiting additional committers from other organizations as well as
> > non-salaried committers to join the project.
> >
> > == Relationships with Other Apache Products ==
> >
> > As mentioned in the Alignment section, Heron implements the Apache
> > Storm API and integrates with multiple Apache schedulers (Apache Mesos,
> > Apache Aurora and Apache Hadoop's YARN) as well as Apache ZooKeeper and
> > Apache Thrift.
> >
> > == An Excessive Fascination with the Apache Brand ==
> >
> > Heron's popularity is growing in the streaming compute space and we are
> > long time supporters of the Apache brand. This proposal is not for the
> > purpose of generating publicity. Rather, the primary benefits of
> > joining Apache are those of community building and open decision making
> > outlined in the Rationale section.
> >
> > == Documentation ==
> >
> > This proposal exists online as
> > http://wiki.apache.org/incubator/HeronProposal. Extensive documentation
> > can be found on GitHub at https://twitter.github.io/heron and the
> > source code is well documented.
> >
> > == Source and Intellectual Property Submission Plan ==
> >
> > The Heron codebase is currently hosted on GitHub:
> > https://github.com/twitter/heron. During incubation, the codebase will
> > be migrated to Apache infrastructure. The source code is already
> > licensed under the Apache License 2.0.
> >
> > == External Dependencies ==
> >
> > All external libraries have Apache License 2.0 compatible licenses
> > except for pylint. The pylint library is GPL licensed, but is only used
> > for pre-build Python style checks and is neither bundled with, nor
> > relied upon by, the Heron source or binary release artifacts.
> >
> > == Cryptography ==
> >
> > Heron does not use any cryptography libraries.
> >
> > = Required Resources =
> >
> > == Mailing lists ==
> >
> > * private@heron.incubator.apache.org (with moderated subscriptions)
> > * dev@heron.incubator.apache.org
> > * commits@heron.incubator.apache.org
> > * user@heron.incubator.apache.org
> >
> > == Subversion Directory ==
> >
> > Git is the preferred source control system: git://git.apache.org/heron
> >
> > == Issue Tracking ==
> >
> > JIRA: Heron (HERON)
> >
> > == Initial Committers ==
> >
> > * Andrew Jorgensen (andrew at andrewjorgensen dot com)
> > * Ashvin Agrawal (ashvin at apache dot org)*
> > * Avrilia Floratou (avrilia dot floratou at gmail dot com)
> > * Bill Graham (billgraham at apache dot org)*
> > * Brian Hatfield (bmhatfield at gmail dot com)
> > * Chris Kellogg (cckellogg at gmail dot com)
> > * Huijun Wu (huijun dot wu dot 2010 at gmail dot com)
> > * Karthik Ramasamy (karthik at gmail dot com)
> > * Maosong Fu (maosongfu at gmail dot com)
> > * Neng Lu (freeneng at gmail dot com)
> > * Runhang Li (obj dot runhang at gmail dot com)
> > * Sanjeev Kulkarni (sanjeevrk at gmail dot com)
> > * Supun Kamburugamuve (supun at apache dot org)*
> > * Thomas Sun (tom dot ssf at gmail dot com)
> > * Yaliang Wang (yaliang dot w dot wang at ieee dot org)
> >
> > == Affiliations ==
> >
> > * Andrew Jorgensen (Google)
> > * Ashvin Agrawal (Microsoft)
> > * Avrilia Floratou (Microsoft)
> > * Bill Graham (Twitter)
> > * Brian Hatfield (Google)
> > * Chris Kellogg (Twitter)
> > * Huijun Wu (Twitter)
> > * Karthik Ramasamy (Streamlio)
> > * Maosong Fu (Twitter)
> > * Neng Lu (Twitter)
> > * Runhang Li (Twitter)
> > * Sanjeev Kulkarni (Streamlio)
> > * Supun Kamburugamuve (Indiana University)
> > * Thomas Sun (Twitter)
> > * Yaliang Wang (Twitter)
> >
> > = Sponsors =
> >
> > == Champion ==
> >
> > * Julien Le Dem (julien at apache dot org)
> >
> > == Nominated Mentors ==
> >
> > * Jake Farrell (jfarrell at apache dot org)
> > * Jacques Nadeau (jacques at apache dot org)
> > * Julien Le Dem (julien at apache dot org)
> > * P. Taylor Goetz (ptgoetz at apache dot org)
> >
> > == Sponsoring Entity ==
> >
> > The Apache Incubator
> >
> > == Footnotes ==
> >
> > * 1 - Papers detailing Heron are available at
> > http://dl.acm.org/citation.cfm?id=2742788 and
> > http://sites.computer.org/debull/A15dec/p15.pdf.
> > * 2 - http://home.apache.org/phonebook.html?uid=billgraham
> > * 3 - http://home.apache.org/phonebook.html?uid=ashvin
> > * 4 - http://home.apache.org/phonebook.html?uid=supun