incubator-general mailing list archives

From Supun Kamburugamuve <supu...@gmail.com>
Subject Re: [PROPOSAL] Heron
Date Wed, 14 Jun 2017 14:35:46 GMT
Hi Taylor,

For me, one of the interesting differences between Heron and Storm is the
execution model. Storm uses a shared-memory model while Heron uses a
process-based model. It will be interesting to see how these two evolve.

Thanks,
Supun..

On Mon, Jun 12, 2017 at 4:15 PM, Bill Graham <billgraham@gmail.com> wrote:

> Hi Taylor,
>
> Thanks for the mentor offer, we'd be glad to have your help.
>
> I think the best place for collaboration would be around the evolution of
> the API. In addition we plan to look more into DSL solutions which we could
> potentially collaborate on. This could be Trident, or Beam or something
> else, but there could be synergies for future development here.
>
> thanks,
> Bill
>
> On Fri, Jun 9, 2017 at 8:53 PM, P. Taylor Goetz <ptgoetz@gmail.com> wrote:
>
> > Hi Bill,
> >
> > Could you comment on how/if the Heron community would be willing to work
> > with the Storm community? I've seen a number of new features in Storm being
> > ported to Heron, but I have yet to see any attempt by the Heron community
> > to engage with the Apache Storm community.
> >
> > I don't think it would be too far off to say that the relationship between
> > Heron and Apache Storm has been somewhat adversarial. The pre- and
> > post-open sourcing marketing around Heron seemed, at least to me, somewhat
> > aggressively negative toward Storm.
> >
> > As a peer to Apache Storm, how would the proposed "Apache Heron" community
> > work to collaborate with the Storm community? If Heron is adopting API
> > changes in Storm, then it seems there is an opportunity for collaboration.
> >
> > Don't take any of this as an objection to incubating the project. I would
> > support it. I would also be willing to be a mentor, if you would consider
> > taking on another.
> >
> > -Taylor
> >
> > > On Jun 8, 2017, at 1:23 PM, Bill Graham <billgraham@gmail.com> wrote:
> > >
> > > Dear Apache Incubator Community,
> > >
> > > We are excited to share our proposal for discussion and feedback
> > > for entering Apache Incubation. Heron is a real-time, distributed,
> > > fault-tolerant stream processing engine.
> > >
> > > Our proposal can be found at
> > > https://wiki.apache.org/incubator/HeronProposal and is included below.
> > >
> > >
> > > Thank you,
> > >
> > > Bill Graham on behalf of the Heron developers
> > >
> > >
> > > # Heron Proposal
> > >
> > > ## Abstract
> > > Heron is a real-time, distributed, fault-tolerant stream processing
> > > engine initially developed by Twitter.
> > >
> > > ## Proposal
> > >
> > > Heron is a real-time stream processing engine built for high performance,
> > > ease of manageability, performance predictability and developer
> > > productivity[1]. We wish to develop a community around Heron to increase
> > > contributions and see Heron thrive in an open forum.
> > >
> > > ## Background
> > >
> > > Heron provides the ability for developers to compose directed acyclic
> > > graphs (DAGs) of real-time query execution logic (i.e. a topology) and
> > > submit the topology to execute on a pluggable job scheduling system
> > > (e.g., Apache Aurora, YARN, Marathon, etc). Users can employ either the
> > > native Heron API or the Apache Storm API to develop the topology. Heron
> > > supports the Storm API for ease of migration, but beyond that Heron’s
> > > architecture differs considerably from Storm’s.
> > >
> > > Users submit a topology to the scheduler using the Heron client, which
> > > uses the Heron binary libraries to deploy all daemons required to run and
> > > manage the topology. The topology therefore has no reliance on centrally
> > > managed Heron services, only on a generic job scheduling system, which
> > > lends itself well to running on top of Apache Aurora/Mesos or Apache
> > > Hadoop/YARN (among others).
> > >
> > > The scheduler runs each topology as a job consisting of multiple
> > > containers. One of the containers runs the topology master, responsible
> > > for managing the topology. The remaining containers each run a stream
> > > manager responsible for data routing, a metrics manager that collects and
> > > reports various metrics, and a number of processes called Heron instances,
> > > which run the user-defined logic on the stream of tuples. Parallelism is
> > > achieved via process-based isolation of Heron instances, which provides
> > > predictable performance while simplifying debugging. The containers are
> > > allocated and managed by the scheduler framework based on resource
> > > availability of nodes in the cluster. The metadata for the topology, such
> > > as the physical plan and execution details, are stored in the pluggable
> > > Heron State Manager (e.g. Apache ZooKeeper).
> > >
> > > ## Rationale
> > >
> > > Heron is a general-purpose, modular and extensible platform that can be
> > > leveraged to support common, real-time analytics use cases. There is an
> > > increasing demand for open-source, scalable real-time analytics systems.
> > > We believe that Heron can be leveraged by other organizations to build
> > > streaming applications that can benefit from its robustness, high
> > > performance, adaptability to cloud environments and ease of use.
> > > Moreover, we hope that open-sourcing Heron will help to further evolve the
> > > technology as the project attracts contributors with diverse backgrounds
> > > and areas of expertise.
> > >
> > > We believe the Apache foundation is a great fit as the long-term home for
> > > Heron, as it provides an established process for community-driven
> > > development and decision making by consensus. This is exactly the model
> > > we want for future Heron development.
> > >
> > > ## Initial Goals
> > >
> > > * Move the existing codebase, website, documentation, and mailing lists
> > >   to Apache-hosted infrastructure.
> > > * Integrate with the Apache development process.
> > > * Ensure all dependencies are compliant with Apache License version 2.0.
> > > * Incrementally develop and release per Apache guidelines.
> > >
> > > ## Current Status
> > >
> > > Heron is a stable project used in production at Twitter since 2014 and
> > > open sourced under the ASL v2 license in 2016. The Heron source code is
> > > currently hosted at github.com (https://github.com/twitter/heron), which
> > > will seed the Apache git repository.
> > >
> > > ### Meritocracy
> > >
> > > By submitting this incubator proposal, we’re expressing our intent to
> > > build a diverse developer community around Heron that will conduct itself
> > > according to The Apache Way and use a meritocratic means of building its
> > > committer base. Several companies and universities have already expressed
> > > interest in and contributed to Heron. Our goal is to grow the Heron
> > > community by encouraging open communication, contribution and
> > > participation of all types, and ensuring that contributors are recognized
> > > appropriately.
> > >
> > > ### Community
> > >
> > > Heron is currently being used by Twitter, Google, Machine Zone and
> > > ndustrial.io and has received significant contributions from Microsoft
> > > and Streamlio. By bringing Heron into the Apache ecosystem, we believe we
> > > can attract even more developers who are interested in creating real-time
> > > systems to build the project's contributor base.
> > >
> > > ### Core Developers
> > >
> > > Current core developers are engineers from Twitter, Google, Microsoft and
> > > Streamlio.
> > >
> > > ### Alignment
> > >
> > > Heron utilizes a number of Apache technologies. Heron leverages Apache
> > > ZooKeeper for coordination and has scheduler implementations to integrate
> > > with Apache Mesos, Apache Aurora and Apache Hadoop's YARN (via Apache
> > > REEF) as well as spout implementations to integrate with Apache Kafka and
> > > metrics implementations to integrate with Scribe. Heron also implements
> > > the Apache Storm user-level API, which allows topologies written against
> > > Storm to run in Heron. We believe that having Heron at Apache will help
> > > further the growth of the streaming compute community, as well as
> > > encourage cooperation and developer cross pollination with other Apache
> > > projects.
> > >
> > > ## Known Risks
> > >
> > > ### Orphaned Products
> > >
> > > The risk of the Heron project being abandoned is minimal. It is used in
> > > production at Twitter and Google and other companies are evaluating or
> > > adopting it for production use.
> > >
> > > ### Inexperience with Open Source
> > >
> > > All of the core contributors to the project have considerable experience
> > > with open source software development. Bill Graham[2], Ashvin Agrawal[3]
> > > and Supun Kamburugamuve[4], committers on the project, are PMC members of
> > > other Apache projects, and Bill and Ashvin have gone through the Apache
> > > incubator process. Twitter has already donated numerous projects to the
> > > ASF (e.g., Apache Mesos, Apache Aurora, Apache Parquet). We also plan to
> > > be mentored by experienced ASF members that can help with any roadblocks.
> > >
> > > ### Homogenous Developers
> > >
> > > Initial committers come from 5 separate organizations. Our intention is
> > > to increase the diversity of contributing developers and their
> > > affiliations. To date, GitHub contributions have come from approximately
> > > 50 contributors from outside the Twitter team.
> > >
> > > ### Reliance on Salaried Developers
> > >
> > > It is expected that Heron development will occur on both salaried time
> > > and on volunteer time. The majority of initial committers are paid by
> > > their employers to contribute to this project. We are committed to
> > > recruiting additional committers from other organizations as well as
> > > non-salaried committers to join the project.
> > >
> > > ### Relationships with Other Apache Products
> > >
> > > As mentioned in the Alignment section, Heron implements the Apache Storm
> > > API and integrates with multiple Apache schedulers (Apache Mesos, Apache
> > > Aurora and Apache Hadoop's YARN) as well as Apache ZooKeeper and Apache
> > > Thrift.
> > >
> > > ### An Excessive Fascination with the Apache Brand
> > >
> > > Heron's popularity is growing in the streaming compute space and we are
> > > long-time supporters of the Apache brand. This proposal is not for the
> > > purpose of generating publicity. Rather, the primary benefits to
> > > joining Apache are those of community building and open decision making
> > > outlined in the Rationale section.
> > >
> > > ## Documentation
> > >
> > > This proposal exists online as
> > > http://wiki.apache.org/incubator/HeronProposal. Extensive documentation
> > > can be found on GitHub at https://twitter.github.io/heron and the source
> > > code is well documented.
> > >
> > > ## Source and Intellectual Property Submission Plan
> > >
> > > The Heron codebase is currently hosted on GitHub:
> > > https://github.com/twitter/heron. During incubation, the codebase will
> > > be migrated to Apache infrastructure. The source code is already ASF 2.0
> > > licensed.
> > >
> > > ## External Dependencies
> > >
> > > All external libraries have ASF 2.0 compatible licenses except for
> > > pylint. The pylint library is GPL licensed, but is only used for pre-build
> > > Python style checks and is neither bundled with, nor relied upon by, the
> > > Heron source or binary release artifacts.
> > >
> > > ## Cryptography
> > >
> > > Heron does not use any cryptography libraries.
> > >
> > > ## Required Resources
> > >
> > > ### Mailing lists
> > >
> > > private@heron.incubator.apache.org (with moderated subscriptions)
> > > dev@heron.incubator.apache.org
> > > commits@heron.incubator.apache.org
> > > user@heron.incubator.apache.org
> > >
> > > ## Subversion Directory
> > >
> > > Git is the preferred source control system: git://git.apache.org/heron
> > >
> > > ## Issue Tracking
> > >
> > > JIRA: Heron (HERON)
> > >
> > > ## Initial Committers
> > >
> > > * Andrew Jorgensen (andrew at andrewjorgensen dot com)
> > > * Ashvin Agrawal (ashvin at apache dot org)*
> > > * Avrilia Floratou (avrilia dot floratou at gmail dot com)
> > > * Bill Graham (billgraham at apache dot org)*
> > > * Brian Hatfield (bmhatfield at gmail dot com)
> > > * Chris Kellogg (cckellogg at gmail dot com)
> > > * Huijun Wu (huijun dot wu dot 2010 at gmail dot com)
> > > * Karthik Ramasamy (karthik at gmail dot com)
> > > * Maosong Fu (maosongfu at gmail dot com)
> > > * Neng Lu (freeneng at gmail dot com)
> > > * Runhang Li (obj dot runhang at gmail dot com)
> > > * Sanjeev Kulkarni (sanjeevrk at gmail dot com)
> > > * Supun Kamburugamuve (supun at apache dot org)*
> > > * Thomas Sun (tom dot ssf at gmail dot com)
> > > * Yaliang Wang (yaliang dot w dot wang at ieee dot org)
> > >
> > > ## Affiliations
> > >
> > > * Andrew Jorgensen (Google)
> > > * Ashvin Agrawal (Microsoft)
> > > * Avrilia Floratou (Microsoft)
> > > * Bill Graham (Twitter)
> > > * Brian Hatfield (Google)
> > > * Chris Kellogg (Twitter)
> > > * Huijun Wu (Twitter)
> > > * Karthik Ramasamy (Streamlio)
> > > * Maosong Fu (Twitter)
> > > * Neng Lu (Twitter)
> > > * Runhang Li (Twitter)
> > > * Sanjeev Kulkarni (Streamlio)
> > > * Supun Kamburugamuve (Indiana University)
> > > * Thomas Sun (Twitter)
> > > * Yaliang Wang (Twitter)
> > >
> > > ## Sponsors
> > >
> > > ### Champion
> > >
> > > * Julien Le Dem (julien at apache dot org)
> > >
> > > ### Nominated Mentors
> > >
> > > * Jake Farrell (jfarrell at apache dot org)
> > > * Jacques Nadeau (jacques at apache dot org)
> > > * Julien Le Dem (julien at apache dot org)
> > >
> > > ### Sponsoring Entity
> > >
> > > The Apache Incubator
> > >
> > > ### Footnotes
> > >
> > > 1 - Papers detailing Heron are available at
> > > http://dl.acm.org/citation.cfm?id=2742788 and
> > > http://sites.computer.org/debull/A15dec/p15.pdf.
> > > 2 - http://home.apache.org/phonebook.html?uid=billgraham
> > > 3 - http://home.apache.org/phonebook.html?uid=ashvin
> > > 4 - http://home.apache.org/phonebook.html?uid=supun



-- 
Supun Kamburugamuve
Member, Apache Software Foundation; http://www.apache.org
E-mail: supun@apache.org; Mobile: +1 812 219 2563
