From: William Markito Oliveira
Date: Wed, 14 Jun 2017 22:04:57 -0500
Subject: Re: [PROPOSAL] Heron
To: general@incubator.apache.org

Howdy!

If Heron is looking for some help around the incubation process, I'd love to
help while the Geode experience is still fresh in my mind and given that it's
a project/space that I do have an interest in.

Since I'm not an ASF member, I don't think I can offer to be a mentor, but I
can probably still help and participate in the process.

Thanks!

On Wed, Jun 14, 2017 at 7:54 PM, P. Taylor Goetz wrote:

> Hi Bill/Supun,
>
> Sorry for not being a little more clear. I was asking more about how the
> Heron community would seek to engage with the Storm community at the
> *community* level as opposed to the technical level (i.e. “Community over
> Code”).
>
> I’ve been asked by many why this has never happened, and have always
> struggled to answer. Maybe you could help answer that question as well as
> if and how that might change if Heron were to incubate.
>
> Another quick question: The proposal mentions Heron being used in
> production at Google, but some Google employees I recently spoke to seemed
> to contradict that. Could you explain? Note that’s nothing that would
> preclude the project from incubating, I’m just curious.
>
> -Taylor
>
> > On Jun 14, 2017, at 7:35 AM, Supun Kamburugamuve wrote:
> >
> > Hi Taylor,
> >
> > For me, one of the interesting differences between Heron and Storm is the
> > execution model. Storm uses a shared memory model while Heron uses a
> > process based model. It will be interesting to see how these two evolve.
> >
> > Thanks,
> > Supun..
> >
> > On Mon, Jun 12, 2017 at 4:15 PM, Bill Graham wrote:
> >
> >> Hi Taylor,
> >>
> >> Thanks for the mentor offer, we'd be glad to have your help.
> >>
> >> I think the best place for collaboration would be around the evolution of
> >> the API. In addition we plan to look more into DSL solutions which we could
> >> potentially collaborate on. This could be Trident, or Beam or something
> >> else, but there could be synergies for future development here.
> >>
> >> thanks,
> >> Bill
> >>
> >> On Fri, Jun 9, 2017 at 8:53 PM, P. Taylor Goetz wrote:
> >>
> >>> Hi Bill,
> >>>
> >>> Could you comment on how/if the Heron community would be willing to work
> >>> with the Storm community? I've seen a number of new features in Storm being
> >>> ported to Heron, but I have yet to see any attempt by the Heron community
> >>> to engage with the Apache Storm community.
> >>>
> >>> I don't think it would be too far off to say that the relationship between
> >>> Heron and Apache Storm has been somewhat adversarial. The pre- and
> >>> post-open sourcing marketing around Heron seemed, at least to me, somewhat
> >>> aggressively negative toward Storm.
> >>>
> >>> As a peer to Apache Storm, how would the proposed "Apache Heron" community
> >>> work to collaborate with the Storm community? If Heron is adopting API
> >>> changes in Storm, then it seems there is an opportunity for collaboration.
> >>>
> >>> Don't take any of this as an objection to incubating the project. I would
> >>> support it. I would also be willing to be a mentor, if you would consider
> >>> taking on another.
> >>>
> >>> -Taylor
> >>>
> >>>> On Jun 8, 2017, at 1:23 PM, Bill Graham wrote:
> >>>>
> >>>> Dear Apache Incubator Community,
> >>>>
> >>>> We are excited to share our proposal for discussion and feedback
> >>>> for entering Apache Incubation. Heron is a real-time, distributed,
> >>>> fault-tolerant stream processing engine.
> >>>>
> >>>> Our proposal can be found at https://wiki.apache.org/incubator/HeronProposal
> >>>> and is included below.
> >>>>
> >>>>
> >>>> Thank you,
> >>>>
> >>>> Bill Graham on behalf of the Heron developers
> >>>>
> >>>>
> >>>> # Heron Proposal
> >>>>
> >>>> ## Abstract
> >>>> Heron is a real-time, distributed, fault-tolerant stream processing engine
> >>>> initially developed by Twitter.
> >>>>
> >>>> ## Proposal
> >>>>
> >>>> Heron is a real-time stream processing engine built for high performance,
> >>>> ease of manageability, performance predictability and developer
> >>>> productivity[1]. We wish to develop a community around Heron to increase
> >>>> contributions and see Heron thrive in an open forum.
> >>>>
> >>>> ## Background
> >>>>
> >>>> Heron provides the ability for developers to compose directed acyclic
> >>>> graphs (DAGs) of real-time query execution logic (i.e. a topology) and
> >>>> submit the topology to execute on a pluggable job scheduling system (e.g.,
> >>>> Apache Aurora, YARN, Marathon, etc.). Users can employ either the native
> >>>> Heron API or the Apache Storm API to develop the topology. Heron supports
> >>>> the Storm API for ease of migration, but beyond that Heron’s architecture
> >>>> differs considerably from Storm’s.
> >>>>
> >>>> Users submit a topology to the scheduler using the Heron client, which uses
> >>>> the Heron binary libraries to deploy all daemons required to run and manage
> >>>> the topology. The topology therefore has no reliance on centrally managed
> >>>> Heron services, only on a generic job scheduling system, which lends itself
> >>>> well to running on top of Apache Aurora/Mesos or Apache Hadoop/YARN (among
> >>>> others).
> >>>>
> >>>> The scheduler runs each topology as a job consisting of multiple
> >>>> containers. One of the containers runs the topology master, responsible for
> >>>> managing the topology.
> >>>> The remaining containers each run a stream manager
> >>>> responsible for data routing, a metrics manager that collects and reports
> >>>> various metrics and a number of processes called Heron instances which run
> >>>> the user-defined logic on the stream of tuples. Parallelism is achieved via
> >>>> process-based isolation of Heron instances, which provides predictable
> >>>> performance while simplifying debugging. The containers are allocated and
> >>>> managed by the scheduler framework based on resource availability of nodes
> >>>> in the cluster. The metadata for the topology, such as the physical plan
> >>>> and execution details, is stored in the pluggable Heron State Manager
> >>>> (e.g. Apache ZooKeeper).
> >>>>
> >>>> ## Rationale
> >>>>
> >>>> Heron is a general-purpose, modular and extensible platform that can be
> >>>> leveraged to support common, real-time analytics use cases. There is an
> >>>> increasing demand for open-source, scalable real-time analytics systems. We
> >>>> believe that Heron can be leveraged by other organizations to build
> >>>> streaming applications that can benefit from its robustness, high
> >>>> performance, adaptability to cloud environments and ease of use. Moreover,
> >>>> we hope that open-sourcing Heron will help to further evolve the technology
> >>>> as the project attracts contributors with diverse backgrounds and areas of
> >>>> expertise.
> >>>>
> >>>> We believe the Apache foundation is a great fit as the long-term home for
> >>>> Heron, as it provides an established process for community-driven
> >>>> development and decision making by consensus. This is exactly the model we
> >>>> want for future Heron development.
> >>>>
> >>>> ## Initial Goals
> >>>>
> >>>> * Move the existing codebase, website, documentation, and mailing lists to
> >>>> Apache-hosted infrastructure.
> >>>> * Integrate with the Apache development process.
> >>>> * Ensure all dependencies are compliant with Apache License version 2.0.
> >>>> * Incrementally develop and release per Apache guidelines.
> >>>>
> >>>> ## Current Status
> >>>>
> >>>> Heron is a stable project used in production at Twitter since 2014 and open
> >>>> sourced under the ASL v2 license in 2016. The Heron source code is
> >>>> currently hosted at github.com (https://github.com/twitter/heron), which
> >>>> will seed the Apache git repository.
> >>>>
> >>>> ### Meritocracy
> >>>>
> >>>> By submitting this incubator proposal, we’re expressing our intent to build
> >>>> a diverse developer community around Heron that will conduct itself
> >>>> according to The Apache Way and use a meritocratic means of building its
> >>>> committer base. Several companies and universities have already expressed
> >>>> interest in and contributed to Heron. Our goal is to grow the Heron
> >>>> community by encouraging open communication, contribution and participation
> >>>> of all types, and ensuring that contributors are recognized appropriately.
> >>>>
> >>>> ### Community
> >>>>
> >>>> Heron is currently being used by Twitter, Google, Machine Zone and
> >>>> ndustrial.io and has received significant contributions from Microsoft and
> >>>> Streamlio. By bringing Heron into the Apache ecosystem, we believe we can
> >>>> attract even more developers who are interested in creating real-time
> >>>> systems to build the project's contributor base.
> >>>>
> >>>> ### Core Developers
> >>>>
> >>>> Current core developers are engineers from Twitter, Google, Microsoft and
> >>>> Streamlio.
> >>>>
> >>>> ### Alignment
> >>>>
> >>>> Heron utilizes a number of Apache technologies.
> >>>> Heron leverages Apache
> >>>> ZooKeeper for coordination and has scheduler implementations to integrate
> >>>> with Apache Mesos, Apache Aurora and Apache Hadoop's YARN (via Apache REEF)
> >>>> as well as spout implementations to integrate with Apache Kafka and metrics
> >>>> implementations to integrate with Scribe. Heron also implements the Apache
> >>>> Storm user-level API, which allows topologies written against Storm to run
> >>>> in Heron. We believe that having Heron at Apache will help further the
> >>>> growth of the streaming compute community, as well as encourage cooperation
> >>>> and developer cross pollination with other Apache projects.
> >>>>
> >>>> ## Known Risks
> >>>>
> >>>> ### Orphaned Products
> >>>>
> >>>> The risk of the Heron project being abandoned is minimal. It is used in
> >>>> production at Twitter and Google, and other companies are evaluating or
> >>>> adopting it for production use.
> >>>>
> >>>> ### Inexperience with Open Source
> >>>>
> >>>> All of the core contributors to the project have considerable experience
> >>>> with open source software development. Bill Graham[2], Ashvin Agrawal[3]
> >>>> and Supun Kamburugamuve[4], committers on the project, are PMC members on
> >>>> other Apache projects, and Bill and Ashvin have gone through the Apache
> >>>> incubator process. Twitter has already donated numerous projects to the ASF
> >>>> (e.g., Apache Mesos, Apache Aurora, Apache Parquet). We also plan to be
> >>>> mentored by experienced ASF members who can help with any roadblocks.
> >>>>
> >>>> ### Homogenous Developers
> >>>>
> >>>> Initial committers come from 5 separate organizations. Our intention is to
> >>>> increase the diversity of contributing developers and their affiliations.
> >>>> To date, GitHub contributions have come from approximately 50 contributors
> >>>> from outside the Twitter team.
> >>>>
> >>>> ### Reliance on Salaried Developers
> >>>>
> >>>> It is expected that Heron development will occur on both salaried time and
> >>>> on volunteer time. The majority of initial committers are paid by their
> >>>> employers to contribute to this project. We are committed to recruiting
> >>>> additional committers from other organizations as well as non-salaried
> >>>> committers to join the project.
> >>>>
> >>>> ### Relationships with Other Apache Products
> >>>>
> >>>> As mentioned in the Alignment section, Heron implements the Apache Storm
> >>>> API and integrates with multiple Apache schedulers (Apache Mesos, Apache
> >>>> Aurora and Apache Hadoop's YARN) as well as Apache ZooKeeper and Apache
> >>>> Thrift.
> >>>>
> >>>> ### An Excessive Fascination with the Apache Brand
> >>>>
> >>>> Heron's popularity is growing in the streaming compute space and we are
> >>>> long-time supporters of the Apache brand. This proposal is not for the
> >>>> purpose of generating publicity. Rather, the primary benefits of
> >>>> joining Apache are those of community building and open decision making
> >>>> outlined in the Rationale section.
> >>>>
> >>>> ## Documentation
> >>>>
> >>>> This proposal exists online as http://wiki.apache.org/incubator/HeronProposal.
> >>>> Extensive documentation can be found on GitHub at
> >>>> https://twitter.github.io/heron and the source code is well documented.
> >>>>
> >>>> ## Source and Intellectual Property Submission Plan
> >>>>
> >>>> The Heron codebase is currently hosted on GitHub:
> >>>> https://github.com/twitter/heron. During incubation, the codebase will be
> >>>> migrated to Apache infrastructure. The source code is already ASF 2.0
> >>>> licensed.
> >>>>
> >>>> ## External Dependencies
> >>>>
> >>>> All external libraries have ASF 2.0 compatible licenses except for pylint.
> >>>> The pylint library is GPL licensed, but is only used for pre-build Python
> >>>> style checks and is neither bundled with, nor relied upon by, the Heron
> >>>> source or binary release artifacts.
> >>>>
> >>>> ## Cryptography
> >>>>
> >>>> Heron does not use any cryptography libraries.
> >>>>
> >>>> ## Required Resources
> >>>>
> >>>> ### Mailing lists
> >>>>
> >>>> private@heron.incubator.apache.org (with moderated subscriptions)
> >>>> dev@heron.incubator.apache.org
> >>>> commits@heron.incubator.apache.org
> >>>> user@heron.incubator.apache.org
> >>>>
> >>>> ## Subversion Directory
> >>>>
> >>>> Git is the preferred source control system: git://git.apache.org/heron
> >>>>
> >>>> ## Issue Tracking
> >>>>
> >>>> JIRA: Heron (HERON)
> >>>>
> >>>> ## Initial Committers
> >>>>
> >>>> * Andrew Jorgensen (andrew at andrewjorgensen dot com)
> >>>> * Ashvin Agrawal (ashvin at apache dot org)*
> >>>> * Avrilia Floratou (avrilia dot floratou at gmail dot com)
> >>>> * Bill Graham (billgraham at apache dot org)*
> >>>> * Brian Hatfield (bmhatfield at gmail dot com)
> >>>> * Chris Kellogg (cckellogg at gmail dot com)
> >>>> * Huijun Wu (huijun dot wu dot 2010 at gmail dot com)
> >>>> * Karthik Ramasamy (karthik at gmail dot com)
> >>>> * Maosong Fu (maosongfu at gmail dot com)
> >>>> * Neng Lu (freeneng at gmail dot com)
> >>>> * Runhang Li (obj dot runhang at gmail dot com)
> >>>> * Sanjeev Kulkarni (sanjeevrk at gmail dot com)
> >>>> * Supun Kamburugamuve (supun at apache dot org)*
> >>>> * Thomas Sun (tom dot ssf at gmail dot com)
> >>>> * Yaliang Wang (yaliang dot w dot wang at ieee dot org)
> >>>>
> >>>> ## Affiliations
> >>>>
> >>>> * Andrew Jorgensen (Google)
> >>>> * Ashvin Agrawal (Microsoft)
> >>>> * Avrilia Floratou (Microsoft)
> >>>> * Bill Graham (Twitter)
> >>>> * Brian Hatfield (Google)
> >>>> * Chris Kellogg (Twitter)
> >>>> * Huijun Wu (Twitter)
> >>>> * Karthik Ramasamy (Streamlio)
> >>>> * Maosong Fu (Twitter)
> >>>> * Neng Lu (Twitter)
> >>>> * Runhang Li (Twitter)
> >>>> * Sanjeev Kulkarni (Streamlio)
> >>>> * Supun Kamburugamuve (Indiana University)
> >>>> * Thomas Sun (Twitter)
> >>>> * Yaliang Wang (Twitter)
> >>>>
> >>>> ## Sponsors
> >>>>
> >>>> ### Champion
> >>>>
> >>>> * Julien Le Dem (julien at apache dot org)
> >>>>
> >>>> ### Nominated Mentors
> >>>>
> >>>> * Jake Farrell (jfarrell at apache dot org)
> >>>> * Jacques Nadeau (jacques at apache dot org)
> >>>> * Julien Le Dem (julien at apache dot org)
> >>>>
> >>>> ### Sponsoring Entity
> >>>>
> >>>> The Apache Incubator
> >>>>
> >>>> ### Footnotes
> >>>>
> >>>> 1 - Papers detailing Heron are available at
> >>>> http://dl.acm.org/citation.cfm?id=2742788 and
> >>>> http://sites.computer.org/debull/A15dec/p15.pdf.
> >>>> 2 - http://home.apache.org/phonebook.html?uid=billgraham
> >>>> 3 - http://home.apache.org/phonebook.html?uid=ashvin
> >>>> 4 - http://home.apache.org/phonebook.html?uid=supun
> >>>
> >>> ---------------------------------------------------------------------
> >>> To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
> >>> For additional commands, e-mail: general-help@incubator.apache.org
> >>>
> >>
> >
> > --
> > Supun Kamburugamuve
> > Member, Apache Software Foundation; http://www.apache.org
> > E-mail: supun@apache.org; Mobile: +1 812 219 2563
>

-- 
~/William