incubator-general mailing list archives

From Ashvin A <aas...@gmail.com>
Subject Re: [PROPOSAL] Heron
Date Thu, 15 Jun 2017 20:48:13 GMT
On Thu, Jun 15, 2017 at 1:24 PM, Bill Graham <billgraham@gmail.com> wrote:

...

> One of our goals during incubation will be to use open forums of
> communication, like the Apache mailing lists, and work to foster a truly
> collaborative environment for both Apache Storm and Heron community members
> to work within together.
>
>
+1



>
> On Thu, Jun 15, 2017 at 10:42 AM, Debo Dutta (dedutta) <dedutta@cisco.com> wrote:
>
> > Am happy to help too!
> >
> > Thx
> > Debo
> >
> > Sent from my iPhone
> >
> > > On Jun 14, 2017, at 8:05 PM, William Markito Oliveira <william.markito@gmail.com> wrote:
> > >
> > > Howdy!
> > >
> > > > If Heron is looking for some help with the incubation process, I'd love to
> > > > help while the Geode experience is still fresh in my mind, and given that it's a
> > > > project/space that I do have an interest in. Since I'm not an ASF member, I don't
> > > > think I can offer to be a mentor, but I can probably still help and
> > > > participate in the process.
> > >
> > > Thanks!
> > >
> > > >> On Wed, Jun 14, 2017 at 7:54 PM, P. Taylor Goetz <ptgoetz@gmail.com> wrote:
> > >>
> > >> Hi Bill/Supun,
> > >>
> > > >> Sorry for not being a little more clear. I was asking more about how the
> > > >> Heron community would seek to engage with Storm community at the
> > > >> *community* level as opposed to the technical level (i.e. “Community over
> > > >> Code”).
> > > >>
> > > >> I’ve been asked by many why this has never happened, and have always
> > > >> struggled to answer. Maybe you could help answer that question as well as
> > > >> if and how that might change if Heron were to incubate.
> > > >>
> > > >> Another quick question: The proposal mentions Heron being used in
> > > >> production at Google, but some Google employees I recently spoke to seemed
> > > >> to contradict that. Could you explain? Note that’s nothing that would
> > > >> preclude the project from incubating, I’m just curious.
> > >>
> > >> -Taylor
> > >>
> > > >>> On Jun 14, 2017, at 7:35 AM, Supun Kamburugamuve <supun06@gmail.com> wrote:
> > >>>
> > >>> Hi Taylor,
> > >>>
> > > >>> For me, one of the interesting differences between Heron and Storm is the
> > > >>> execution model. Storm uses a shared memory model while Heron uses a
> > > >>> process based model. It will be interesting to see how these two evolve.
> > >>>
> > >>> Thanks,
> > >>> Supun..
> > >>>
> > > >>> On Mon, Jun 12, 2017 at 4:15 PM, Bill Graham <billgraham@gmail.com> wrote:
> > >>>
> > >>>> Hi Taylor,
> > >>>>
> > >>>> Thanks for the mentor offer, we'd be glad to have your help.
> > >>>>
> > > >>>> I think the best place for collaboration would be around the evolution of
> > > >>>> the API. In addition we plan to look more into DSL solutions which we could
> > > >>>> potentially collaborate on. This could be Trident, or Beam or something
> > > >>>> else, but there could be synergies for future development here.
> > >>>>
> > >>>> thanks,
> > >>>> Bill
> > >>>>
> > > >>>> On Fri, Jun 9, 2017 at 8:53 PM, P. Taylor Goetz <ptgoetz@gmail.com> wrote:
> > >>>>
> > >>>>> Hi Bill,
> > >>>>>
> > > >>>>> Could you comment on how/if the Heron community would be willing to work
> > > >>>>> with the Storm community? I've seen a number of new features in Storm being
> > > >>>>> ported to Heron, but I have yet to see any attempt by the Heron community
> > > >>>>> to engage with the Apache Storm community.
> > > >>>>>
> > > >>>>> I don't think it would be too far off to say that the relationship between
> > > >>>>> Heron and Apache Storm has been somewhat adversarial. The pre- and
> > > >>>>> post-open sourcing marketing around Heron seemed, at least to me, somewhat
> > > >>>>> aggressively negative toward Storm.
> > > >>>>>
> > > >>>>> As a peer to Apache Storm, how would the proposed "Apache Heron" community
> > > >>>>> work to collaborate with the Storm community? If Heron is adopting API
> > > >>>>> changes in Storm, then it seems there is an opportunity for collaboration.
> > > >>>>>
> > > >>>>> Don't take any of this as an objection to incubating the project. I would
> > > >>>>> support it. I would also be willing to be a mentor, if you would consider
> > > >>>>> taking on another.
> > >>>>>
> > >>>>> -Taylor
> > >>>>>
> > > >>>>>> On Jun 8, 2017, at 1:23 PM, Bill Graham <billgraham@gmail.com> wrote:
> > >>>>>>
> > >>>>>> Dear Apache Incubator Community,
> > >>>>>>
> > > >>>>>> We are excited to share our proposal for discussion and feedback
> > > >>>>>> for entering Apache Incubation. Heron is a real-time, distributed,
> > > >>>>>> fault-tolerant stream processing engine.
> > > >>>>>>
> > > >>>>>> Our proposal can be found at https://wiki.apache.org/incubator/HeronProposal
> > > >>>>>> and is included below.
> > >>>>>>
> > >>>>>>
> > >>>>>> Thank you,
> > >>>>>>
> > >>>>>> Bill Graham on behalf of the Heron developers
> > >>>>>>
> > >>>>>>
> > >>>>>> # Heron Proposal
> > >>>>>>
> > >>>>>> ## Abstract
> > > >>>>>> Heron is a real-time, distributed, fault-tolerant stream processing engine
> > > >>>>>> initially developed by Twitter.
> > >>>>>>
> > >>>>>> ## Proposal
> > >>>>>>
> > > >>>>>> Heron is a real-time stream processing engine built for high performance,
> > > >>>>>> ease of manageability, performance predictability and developer
> > > >>>>>> productivity[1]. We wish to develop a community around Heron to increase
> > > >>>>>> contributions and see Heron thrive in an open forum.
> > >>>>>>
> > >>>>>> ## Background
> > >>>>>>
> > > >>>>>> Heron provides the ability for developers to compose directed acyclic
> > > >>>>>> graphs (DAGs) of real-time query execution logic (i.e. a topology) and
> > > >>>>>> submit the topology to execute on a pluggable job scheduling system (e.g.,
> > > >>>>>> Apache Aurora, YARN, Marathon, etc). Users can employ either the native
> > > >>>>>> Heron API or the Apache Storm API to develop the topology. Heron supports
> > > >>>>>> the Storm API for ease of migration, but beyond that Heron’s architecture
> > > >>>>>> differs considerably from Storm’s.
> > >>>>>>
> > > >>>>>> Users submit a topology to the scheduler using the Heron client, which uses
> > > >>>>>> the Heron binary libraries to deploy all daemons required to run and manage
> > > >>>>>> the topology. The topology therefore has no reliance on centrally managed
> > > >>>>>> Heron services, only on a generic job scheduling system, which lends itself
> > > >>>>>> well to running on top of Apache Aurora/Mesos or Apache Hadoop/YARN (among
> > > >>>>>> others).
> > >>>>>>
> > > >>>>>> The scheduler runs each topology as a job consisting of multiple
> > > >>>>>> containers. One of the containers runs the topology master, responsible for
> > > >>>>>> managing the topology. The remaining containers each run a stream manager
> > > >>>>>> responsible for data routing, a metrics manager that collects and reports
> > > >>>>>> various metrics and a number of processes called Heron instances which run
> > > >>>>>> the user-defined logic on the stream of tuples. Parallelism is achieved via
> > > >>>>>> process-based isolation of Heron instances, which provides predictable
> > > >>>>>> performance while simplifying debugging. The containers are allocated and
> > > >>>>>> managed by the scheduler framework based on resource availability of nodes
> > > >>>>>> in the cluster. The metadata for the topology, such as the physical plan
> > > >>>>>> and execution details, is stored in the pluggable Heron State Manager
> > > >>>>>> (e.g. Apache ZooKeeper).
> > >>>>>>
> > >>>>>> ## Rationale
> > >>>>>>
> > > >>>>>> Heron is a general-purpose, modular and extensible platform that can be
> > > >>>>>> leveraged to support common, real-time analytics use cases. There is an
> > > >>>>>> increasing demand for open-source, scalable real-time analytics systems. We
> > > >>>>>> believe that Heron can be leveraged by other organizations to build
> > > >>>>>> streaming applications that can benefit from its robustness, high
> > > >>>>>> performance, adaptability to cloud environments and ease of use. Moreover,
> > > >>>>>> we hope that open-sourcing Heron will help to further evolve the technology
> > > >>>>>> as the project attracts contributors with diverse backgrounds and areas of
> > > >>>>>> expertise.
> > > >>>>>>
> > > >>>>>> We believe the Apache foundation is a great fit as the long-term home for
> > > >>>>>> Heron, as it provides an established process for community-driven
> > > >>>>>> development and decision making by consensus. This is exactly the model we
> > > >>>>>> want for future Heron development.
> > >>>>>>
> > >>>>>> ## Initial Goals
> > >>>>>>
> > > >>>>>> * Move the existing codebase, website, documentation, and mailing lists to
> > > >>>>>> Apache-hosted infrastructure.
> > > >>>>>> * Integrate with the Apache development process.
> > > >>>>>> * Ensure all dependencies are compliant with Apache License version 2.0.
> > > >>>>>> * Incrementally develop and release per Apache guidelines.
> > >>>>>>
> > >>>>>> ## Current Status
> > >>>>>>
> > > >>>>>> Heron is a stable project used in production at Twitter since 2014 and open
> > > >>>>>> sourced under the ASL v2 license in 2016. The Heron source code is
> > > >>>>>> currently hosted at github.com (https://github.com/twitter/heron), which
> > > >>>>>> will seed the Apache git repository.
> > >>>>>>
> > >>>>>> ### Meritocracy
> > >>>>>>
> > > >>>>>> By submitting this incubator proposal, we’re expressing our intent to build
> > > >>>>>> a diverse developer community around Heron that will conduct itself
> > > >>>>>> according to The Apache Way and use a meritocratic means of building its
> > > >>>>>> committer base. Several companies and universities have already expressed
> > > >>>>>> interest in and contributed to Heron. Our goal is to grow the Heron
> > > >>>>>> community by encouraging open communication, contribution and participation
> > > >>>>>> of all types, and ensuring that contributors are recognized appropriately.
> > >>>>>>
> > >>>>>> ### Community
> > >>>>>>
> > > >>>>>> Heron is currently being used by Twitter, Google, Machine Zone and
> > > >>>>>> ndustrial.io and has received significant contributions by Microsoft and
> > > >>>>>> Streamlio. By bringing Heron into the Apache ecosystem, we believe we can
> > > >>>>>> attract even more developers who are interested in creating real-time
> > > >>>>>> systems to build the project's contributor base.
> > >>>>>>
> > >>>>>> ### Core Developers
> > >>>>>>
> > > >>>>>> Current core developers are engineers from Twitter, Google, Microsoft and
> > > >>>>>> Streamlio.
> > >>>>>>
> > >>>>>> ### Alignment
> > >>>>>>
> > > >>>>>> Heron utilizes a number of Apache technologies. Heron leverages Apache
> > > >>>>>> ZooKeeper for coordination and has scheduler implementations to integrate
> > > >>>>>> with Apache Mesos, Apache Aurora and Apache Hadoop's YARN (via Apache REEF)
> > > >>>>>> as well as spout implementations to integrate with Apache Kafka and metrics
> > > >>>>>> implementations to integrate with Scribe. Heron also implements the Apache
> > > >>>>>> Storm user-level API, which allows topologies written against Storm to run
> > > >>>>>> in Heron. We believe that having Heron at Apache will help further the
> > > >>>>>> growth of the streaming compute community, as well as encourage cooperation
> > > >>>>>> and developer cross pollination with other Apache projects.
> > >>>>>>
> > >>>>>> ## Known Risks
> > >>>>>>
> > >>>>>> ### Orphaned Products
> > >>>>>>
> > > >>>>>> The risk of the Heron project being abandoned is minimal. It is used in
> > > >>>>>> production at Twitter and Google and other companies are evaluating or
> > > >>>>>> adopting it for production use.
> > >>>>>>
> > >>>>>> ### Inexperience with Open Source
> > >>>>>>
> > > >>>>>> All of the core contributors to the project have considerable experience
> > > >>>>>> with open source software development. Bill Graham[2], Ashvin Agrawal[3]
> > > >>>>>> and Supun Kamburugamuve[4], committers on the project, are PMCs on other
> > > >>>>>> Apache projects and Bill and Ashvin have gone through the Apache incubator
> > > >>>>>> process. Twitter has already donated numerous projects to the ASF (e.g.,
> > > >>>>>> Apache Mesos, Apache Aurora, Apache Parquet). We also plan to be mentored
> > > >>>>>> by experienced ASF members that can help with any roadblocks.
> > >>>>>>
> > >>>>>> ### Homogenous Developers
> > >>>>>>
> > > >>>>>> Initial committers come from 5 separate organizations. Our intention is to
> > > >>>>>> increase the diversity of contributing developers and their affiliations.
> > > >>>>>> To date github contributions have come from approximately 50 contributors
> > > >>>>>> from outside the Twitter team.
> > >>>>>>
> > >>>>>> ### Reliance on Salaried Developers
> > >>>>>>
> > > >>>>>> It is expected that Heron development will occur on both salaried time and
> > > >>>>>> on volunteer time. The majority of initial committers are paid by their
> > > >>>>>> employers to contribute to this project. We are committed to recruiting
> > > >>>>>> additional committers from other organizations as well as non-salaried
> > > >>>>>> committers to join the project.
> > >>>>>>
> > >>>>>> ### Relationships with Other Apache Products
> > >>>>>>
> > > >>>>>> As mentioned in the Alignment section, Heron implements the Apache Storm
> > > >>>>>> API and integrates with multiple Apache schedulers (Apache Mesos, Apache
> > > >>>>>> Aurora and Apache Hadoop's YARN) as well as Apache ZooKeeper and Apache
> > > >>>>>> Thrift.
> > >>>>>>
> > >>>>>> ### An Excessive Fascination with the Apache Brand
> > >>>>>>
> > > >>>>>> Heron's popularity is growing in the streaming compute space and we are
> > > >>>>>> long time supporters of the Apache brand. This proposal is not for the
> > > >>>>>> purpose of generating publicity. Rather, the primary benefits of
> > > >>>>>> joining Apache are those of community building and open decision making
> > > >>>>>> outlined in the Rationale section.
> > >>>>>>
> > >>>>>> ## Documentation
> > >>>>>>
> > > >>>>>> This proposal exists online as http://wiki.apache.org/incubator/HeronProposal.
> > > >>>>>> Extensive documentation can be found on github at
> > > >>>>>> https://twitter.github.io/heron and the source code is well documented.
> > >>>>>>
> > >>>>>> ## Source and Intellectual Property Submission Plan
> > >>>>>>
> > > >>>>>> The Heron codebase is currently hosted on Github:
> > > >>>>>> https://github.com/twitter/heron. During incubation, the codebase will be
> > > >>>>>> migrated to Apache infrastructure. The source code is already ASF 2.0
> > > >>>>>> licensed.
> > >>>>>>
> > >>>>>> ## External Dependencies
> > >>>>>>
> > > >>>>>> All external libraries have ASF 2.0 compatible licenses except for pylint.
> > > >>>>>> The pylint library is GPL licensed, but is only used for pre-build Python
> > > >>>>>> style checks and is neither bundled with, nor relied upon by, the Heron
> > > >>>>>> source or binary release artifacts.
> > >>>>>>
> > >>>>>> ## Cryptography
> > >>>>>>
> > >>>>>> Heron does not use any cryptography libraries.
> > >>>>>>
> > >>>>>> ## Required Resources
> > >>>>>>
> > >>>>>> ### Mailing lists
> > >>>>>>
> > >>>>>> private@heron.incubator.apache.org (with moderated subscriptions)
> > >>>>>> dev@heron.incubator.apache.org
> > >>>>>> commits@heron.incubator.apache.org
> > >>>>>> user@heron.incubator.apache.org
> > >>>>>>
> > >>>>>> ## Subversion Directory
> > >>>>>>
> > > >>>>>> Git is the preferred source control system: git://git.apache.org/heron
> > >>>>>>
> > >>>>>> ## Issue Tracking
> > >>>>>>
> > >>>>>> JIRA: Heron (HERON)
> > >>>>>>
> > >>>>>> ## Initial Committers
> > >>>>>>
> > >>>>>> * Andrew Jorgensen (andrew at andrewjorgensen dot com)
> > >>>>>> * Ashvin Agrawal (ashvin at apache dot org)*
> > >>>>>> * Avrilia Floratou (avrilia dot floratou at gmail dot com)
> > >>>>>> * Bill Graham (billgraham at apache dot org)*
> > >>>>>> * Brian Hatfield (bmhatfield at gmail dot com)
> > >>>>>> * Chris Kellogg (cckellogg at gmail dot com)
> > >>>>>> * Huijun Wu (huijun dot wu dot 2010 at gmail dot com)
> > >>>>>> * Karthik Ramasamy (karthik at gmail dot com)
> > >>>>>> * Maosong Fu (maosongfu at gmail dot com)
> > > >>>>>> * Neng Lu (freeneng at gmail dot com)
> > >>>>>> * Runhang Li (obj dot runhang at gmail dot com)
> > >>>>>> * Sanjeev Kulkarni (sanjeevrk at gmail dot com)
> > >>>>>> * Supun Kamburugamuve (supun at apache dot org)*
> > >>>>>> * Thomas Sun (tom dot ssf at gmail dot com)
> > >>>>>> * Yaliang Wang (yaliang dot w dot wang at ieee dot org)
> > >>>>>>
> > >>>>>> ## Affiliations
> > >>>>>>
> > >>>>>> * Andrew Jorgensen (Google)
> > >>>>>> * Ashvin Agrawal (Microsoft)
> > >>>>>> * Avrilia Floratou (Microsoft)
> > >>>>>> * Bill Graham (Twitter)
> > >>>>>> * Brian Hatfield (Google)
> > >>>>>> * Chris Kellogg (Twitter)
> > >>>>>> * Huijun Wu (Twitter)
> > >>>>>> * Karthik Ramasamy (Streamlio)
> > >>>>>> * Maosong Fu (Twitter)
> > >>>>>> * Neng Lu (Twitter)
> > >>>>>> * Runhang Li (Twitter)
> > >>>>>> * Sanjeev Kulkarni (Streamlio)
> > >>>>>> * Supun Kamburugamuve (Indiana University)
> > >>>>>> * Thomas Sun (Twitter)
> > >>>>>> * Yaliang Wang (Twitter)
> > >>>>>>
> > >>>>>> ## Sponsors
> > >>>>>>
> > >>>>>> ### Champion
> > >>>>>>
> > >>>>>> * Julien Le Dem (julien at apache dot org)
> > >>>>>>
> > >>>>>> ### Nominated Mentors
> > >>>>>>
> > >>>>>> * Jake Farrell (jfarrell at apache dot org)
> > >>>>>> * Jacques Nadeau (jacques at apache dot org)
> > >>>>>> * Julien Le Dem (julien at apache dot org)
> > >>>>>>
> > >>>>>> ### Sponsoring Entity
> > >>>>>>
> > >>>>>> The Apache Incubator
> > >>>>>>
> > >>>>>> ### Footnotes
> > >>>>>>
> > > >>>>>> 1 - Papers detailing Heron are available at
> > > >>>>>> http://dl.acm.org/citation.cfm?id=2742788 and
> > > >>>>>> http://sites.computer.org/debull/A15dec/p15.pdf.
> > >>>>>> 2 - http://home.apache.org/phonebook.html?uid=billgraham
> > >>>>>> 3 - http://home.apache.org/phonebook.html?uid=ashvin
> > >>>>>> 4 - http://home.apache.org/phonebook.html?uid=supun
> > >>>>>
> > >>>>> ------------------------------------------------------------
> > ---------
> > >>>>> To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
> > >>>>> For additional commands, e-mail: general-help@incubator.apache.org
> > >>>>>
> > >>>>>
> > >>>>
> > >>>
> > >>>
> > >>>
> > >>> --
> > >>> Supun Kamburugamuve
> > >>> Member, Apache Software Foundation; http://www.apache.org
> > > >>> E-mail: supun@apache.org; Mobile: +1 812 219 2563
> > >>
> > >>
> > >> ---------------------------------------------------------------------
> > >> To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
> > >> For additional commands, e-mail: general-help@incubator.apache.org
> > >>
> > >>
> > >
> > >
> > > --
> > > ~/William
> >
> >
> >
>
