incubator-general mailing list archives

From "P. Taylor Goetz" <ptgo...@gmail.com>
Subject Re: [PROPOSAL] Heron
Date Thu, 15 Jun 2017 23:33:56 GMT
Thanks for the clarification Bill!

-Taylor


> On Jun 15, 2017, at 1:24 PM, Bill Graham <billgraham@gmail.com> wrote:
> 
> Hi Taylor,
> 
> The Heron team engaged with members of the Apache Storm community through
> private channels before the project was made available as open source. We
> recognize this is not the ideal approach and going forward we will use more
> collaborative methods as we progress and grow the Heron community.
> 
> One of our goals during incubation will be to use open forums of
> communication, like the Apache mailing lists, and work to foster a truly
> collaborative environment for both Apache Storm and Heron community members
> to work within together.
> 
> The Fabric team at Google uses Heron extensively.
> 
> thanks,
> Bill
> 
> On Thu, Jun 15, 2017 at 10:42 AM, Debo Dutta (dedutta) <dedutta@cisco.com>
> wrote:
> 
>> Am happy to help too!
>> 
>> Thx
>> Debo
>> 
>> Sent from my iPhone
>> 
>>> On Jun 14, 2017, at 8:05 PM, William Markito Oliveira <william.markito@gmail.com> wrote:
>>> 
>>> Howdy!
>>> 
>>> If Heron is looking for some help with the incubation process, I'd love
>>> to help while the Geode experience is still fresh in my mind, and given
>>> that it's a project/space that I have an interest in. Since I'm not an
>>> ASF member, I don't think I can offer to be a mentor, but I can probably
>>> still help and participate in the process.
>>> 
>>> Thanks!
>>> 
>>>> On Wed, Jun 14, 2017 at 7:54 PM, P. Taylor Goetz <ptgoetz@gmail.com> wrote:
>>>> 
>>>> Hi Bill/Supun,
>>>> 
>>>> Sorry for not being a little more clear. I was asking more about how the
>>>> Heron community would seek to engage with the Storm community at the
>>>> *community* level as opposed to the technical level (i.e. “Community
>>>> over Code”).
>>>> 
>>>> I’ve been asked by many why this has never happened, and have always
>>>> struggled to answer. Maybe you could help answer that question as well
>>>> as if and how that might change if Heron were to incubate.
>>>> 
>>>> Another quick question: The proposal mentions Heron being used in
>>>> production at Google, but some Google employees I recently spoke to
>>>> seemed to contradict that. Could you explain? Note that’s nothing that
>>>> would preclude the project from incubating, I’m just curious.
>>>> 
>>>> -Taylor
>>>> 
>>>>> On Jun 14, 2017, at 7:35 AM, Supun Kamburugamuve <supun06@gmail.com> wrote:
>>>>> 
>>>>> Hi Taylor,
>>>>> 
>>>>> For me, one of the interesting differences between Heron and Storm is
>>>>> the execution model. Storm uses a shared memory model while Heron uses
>>>>> a process based model. It will be interesting to see how these two
>>>>> evolve.
>>>>> 
>>>>> Thanks,
>>>>> Supun..
>>>>> 
>>>>> On Mon, Jun 12, 2017 at 4:15 PM, Bill Graham <billgraham@gmail.com> wrote:
>>>>> 
>>>>>> Hi Taylor,
>>>>>> 
>>>>>> Thanks for the mentor offer, we'd be glad to have your help.
>>>>>> 
>>>>>> I think the best place for collaboration would be around the evolution
>>>>>> of the API. In addition we plan to look more into DSL solutions which
>>>>>> we could potentially collaborate on. This could be Trident, or Beam or
>>>>>> something else, but there could be synergies for future development
>>>>>> here.
>>>>>> 
>>>>>> thanks,
>>>>>> Bill
>>>>>> 
>>>>>> On Fri, Jun 9, 2017 at 8:53 PM, P. Taylor Goetz <ptgoetz@gmail.com> wrote:
>>>>>> 
>>>>>>> Hi Bill,
>>>>>>> 
>>>>>>> Could you comment on how/if the Heron community would be willing to
>>>>>>> work with the Storm community? I've seen a number of new features in
>>>>>>> Storm being ported to Heron, but I have yet to see any attempt by the
>>>>>>> Heron community to engage with the Apache Storm community.
>>>>>>> 
>>>>>>> I don't think it would be too far off to say that the relationship
>>>>>>> between Heron and Apache Storm has been somewhat adversarial. The
>>>>>>> pre- and post-open sourcing marketing around Heron seemed, at least
>>>>>>> to me, somewhat aggressively negative toward Storm.
>>>>>>> 
>>>>>>> As a peer to Apache Storm, how would the proposed "Apache Heron"
>>>>>>> community work to collaborate with the Storm community? If Heron is
>>>>>>> adopting API changes in Storm, then it seems there is an opportunity
>>>>>>> for collaboration.
>>>>>>> 
>>>>>>> Don't take any of this as an objection to incubating the project. I
>>>>>>> would support it. I would also be willing to be a mentor, if you
>>>>>>> would consider taking on another.
>>>>>>> 
>>>>>>> -Taylor
>>>>>>> 
>>>>>>>> On Jun 8, 2017, at 1:23 PM, Bill Graham <billgraham@gmail.com> wrote:
>>>>>>>> 
>>>>>>>> Dear Apache Incubator Community,
>>>>>>>> 
>>>>>>>> We are excited to share our proposal for discussion and feedback
>>>>>>>> for entering Apache Incubation. Heron is a real-time, distributed,
>>>>>>>> fault-tolerant stream processing engine.
>>>>>>>> 
>>>>>>>> Our proposal can be found at
>>>>>>>> https://wiki.apache.org/incubator/HeronProposal
>>>>>>>> and is included below.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Thank you,
>>>>>>>> 
>>>>>>>> Bill Graham on behalf of the Heron developers
>>>>>>>> 
>>>>>>>> 
>>>>>>>> # Heron Proposal
>>>>>>>> 
>>>>>>>> ## Abstract
>>>>>>>> Heron is a real-time, distributed, fault-tolerant stream processing
>>>>>>>> engine initially developed by Twitter.
>>>>>>>> 
>>>>>>>> ## Proposal
>>>>>>>> 
>>>>>>>> Heron is a real-time stream processing engine built for high
>>>>>>>> performance, ease of manageability, performance predictability and
>>>>>>>> developer productivity[1]. We wish to develop a community around
>>>>>>>> Heron to increase contributions and see Heron thrive in an open
>>>>>>>> forum.
>>>>>>>> 
>>>>>>>> ## Background
>>>>>>>> 
>>>>>>>> Heron provides the ability for developers to compose directed acyclic
>>>>>>>> graphs (DAGs) of real-time query execution logic (i.e. a topology)
>>>>>>>> and submit the topology to execute on a pluggable job scheduling
>>>>>>>> system (e.g., Apache Aurora, YARN, Marathon, etc). Users can employ
>>>>>>>> either the native Heron API or the Apache Storm API to develop the
>>>>>>>> topology. Heron supports the Storm API for ease of migration, but
>>>>>>>> beyond that Heron’s architecture differs considerably from Storm’s.
>>>>>>>> 
>>>>>>>> Users submit a topology to the scheduler using the Heron client,
>>>>>>>> which uses the Heron binary libraries to deploy all daemons required
>>>>>>>> to run and manage the topology. The topology therefore has no
>>>>>>>> reliance on centrally managed Heron services, only on a generic job
>>>>>>>> scheduling system, which lends itself well to running on top of
>>>>>>>> Apache Aurora/Mesos or Apache Hadoop/YARN (among others).
>>>>>>>> 
>>>>>>>> The scheduler runs each topology as a job consisting of multiple
>>>>>>>> containers. One of the containers runs the topology master,
>>>>>>>> responsible for managing the topology. The remaining containers each
>>>>>>>> run a stream manager responsible for data routing, a metrics manager
>>>>>>>> that collects and reports various metrics, and a number of processes
>>>>>>>> called Heron instances which run the user-defined logic on the stream
>>>>>>>> of tuples. Parallelism is achieved via process-based isolation of
>>>>>>>> Heron instances, which provides predictable performance while
>>>>>>>> simplifying debugging. The containers are allocated and managed by
>>>>>>>> the scheduler framework based on resource availability of nodes in
>>>>>>>> the cluster. The metadata for the topology, such as the physical plan
>>>>>>>> and execution details, are stored in the pluggable Heron State
>>>>>>>> Manager (e.g. Apache ZooKeeper).
>>>>>>>> 
>>>>>>>> ## Rationale
>>>>>>>> 
>>>>>>>> Heron is a general-purpose, modular and extensible platform that can
>>>>>>>> be leveraged to support common, real-time analytics use cases. There
>>>>>>>> is an increasing demand for open-source, scalable real-time analytics
>>>>>>>> systems. We believe that Heron can be leveraged by other
>>>>>>>> organizations to build streaming applications that can benefit from
>>>>>>>> its robustness, high performance, adaptability to cloud environments
>>>>>>>> and ease of use. Moreover, we hope that open-sourcing Heron will help
>>>>>>>> to further evolve the technology as the project attracts contributors
>>>>>>>> with diverse backgrounds and areas of expertise.
>>>>>>>> 
>>>>>>>> We believe the Apache foundation is a great fit as the long-term home
>>>>>>>> for Heron, as it provides an established process for community-driven
>>>>>>>> development and decision making by consensus. This is exactly the
>>>>>>>> model we want for future Heron development.
>>>>>>>> 
>>>>>>>> ## Initial Goals
>>>>>>>> 
>>>>>>>> * Move the existing codebase, website, documentation, and mailing
>>>>>>>>   lists to Apache-hosted infrastructure.
>>>>>>>> * Integrate with the Apache development process.
>>>>>>>> * Ensure all dependencies are compliant with Apache License version
>>>>>>>>   2.0.
>>>>>>>> * Incrementally develop and release per Apache guidelines.
>>>>>>>> 
>>>>>>>> ## Current Status
>>>>>>>> 
>>>>>>>> Heron is a stable project used in production at Twitter since 2014
>>>>>>>> and open sourced under the ASL v2 license in 2016. The Heron source
>>>>>>>> code is currently hosted at github.com
>>>>>>>> (https://github.com/twitter/heron), which will seed the Apache git
>>>>>>>> repository.
>>>>>>>> 
>>>>>>>> ### Meritocracy
>>>>>>>> 
>>>>>>>> By submitting this incubator proposal, we’re expressing our intent to
>>>>>>>> build a diverse developer community around Heron that will conduct
>>>>>>>> itself according to The Apache Way and use a meritocratic means of
>>>>>>>> building its committer base. Several companies and universities have
>>>>>>>> already expressed interest in and contributed to Heron. Our goal is
>>>>>>>> to grow the Heron community by encouraging open communication,
>>>>>>>> contribution and participation of all types, and ensuring that
>>>>>>>> contributors are recognized appropriately.
>>>>>>>> 
>>>>>>>> ### Community
>>>>>>>> 
>>>>>>>> Heron is currently being used by Twitter, Google, Machine Zone and
>>>>>>>> ndustrial.io and has received significant contributions by Microsoft
>>>>>>>> and Streamlio. By bringing Heron into the Apache ecosystem, we
>>>>>>>> believe we can attract even more developers who are interested in
>>>>>>>> creating real-time systems to build the project's contributor base.
>>>>>>>> 
>>>>>>>> ### Core Developers
>>>>>>>> 
>>>>>>>> Current core developers are engineers from Twitter, Google, Microsoft
>>>>>>>> and Streamlio.
>>>>>>>> 
>>>>>>>> ### Alignment
>>>>>>>> 
>>>>>>>> Heron utilizes a number of Apache technologies. Heron leverages
>>>>>>>> Apache ZooKeeper for coordination and has scheduler implementations
>>>>>>>> to integrate with Apache Mesos, Apache Aurora and Apache Hadoop's
>>>>>>>> YARN (via Apache REEF) as well as spout implementations to integrate
>>>>>>>> with Apache Kafka and metrics implementations to integrate with
>>>>>>>> Scribe. Heron also implements the Apache Storm user-level API, which
>>>>>>>> allows topologies written against Storm to run in Heron. We believe
>>>>>>>> that having Heron at Apache will help further the growth of the
>>>>>>>> streaming compute community, as well as encourage cooperation and
>>>>>>>> developer cross pollination with other Apache projects.
>>>>>>>> 
>>>>>>>> ## Known Risks
>>>>>>>> 
>>>>>>>> ### Orphaned Products
>>>>>>>> 
>>>>>>>> The risk of the Heron project being abandoned is minimal. It is used
>>>>>>>> in production at Twitter and Google, and other companies are
>>>>>>>> evaluating or adopting it for production use.
>>>>>>>> 
>>>>>>>> ### Inexperience with Open Source
>>>>>>>> 
>>>>>>>> All of the core contributors to the project have considerable
>>>>>>>> experience with open source software development. Bill Graham[2],
>>>>>>>> Ashvin Agrawal[3] and Supun Kamburugamuve[4], committers on the
>>>>>>>> project, are PMC members on other Apache projects, and Bill and
>>>>>>>> Ashvin have gone through the Apache incubator process. Twitter has
>>>>>>>> already donated numerous projects to the ASF (e.g., Apache Mesos,
>>>>>>>> Apache Aurora, Apache Parquet). We also plan to be mentored by
>>>>>>>> experienced ASF members that can help with any roadblocks.
>>>>>>>> 
>>>>>>>> ### Homogenous Developers
>>>>>>>> 
>>>>>>>> Initial committers come from 5 separate organizations. Our intention
>>>>>>>> is to increase the diversity of contributing developers and their
>>>>>>>> affiliations. To date, GitHub contributions have come from
>>>>>>>> approximately 50 contributors from outside the Twitter team.
>>>>>>>> 
>>>>>>>> ### Reliance on Salaried Developers
>>>>>>>> 
>>>>>>>> It is expected that Heron development will occur on both salaried
>>>>>>>> time and on volunteer time. The majority of initial committers are
>>>>>>>> paid by their employers to contribute to this project. We are
>>>>>>>> committed to recruiting additional committers from other
>>>>>>>> organizations as well as non-salaried committers to join the project.
>>>>>>>> 
>>>>>>>> ### Relationships with Other Apache Products
>>>>>>>> 
>>>>>>>> As mentioned in the Alignment section, Heron implements the Apache
>>>>>>>> Storm API and integrates with multiple Apache schedulers (Apache
>>>>>>>> Mesos, Apache Aurora and Apache Hadoop's YARN) as well as Apache
>>>>>>>> ZooKeeper and Apache Thrift.
>>>>>>>> 
>>>>>>>> ### An Excessive Fascination with the Apache Brand
>>>>>>>> 
>>>>>>>> Heron's popularity is growing in the streaming compute space and we
>>>>>>>> are long time supporters of the Apache brand. This proposal is not
>>>>>>>> for the purpose of generating publicity. Rather, the primary benefits
>>>>>>>> to joining Apache are those of community building and open decision
>>>>>>>> making outlined in the Rationale section.
>>>>>>>> 
>>>>>>>> ## Documentation
>>>>>>>> 
>>>>>>>> This proposal exists online as
>>>>>>>> http://wiki.apache.org/incubator/HeronProposal. Extensive
>>>>>>>> documentation can be found on GitHub at
>>>>>>>> https://twitter.github.io/heron and the source code is well
>>>>>>>> documented.
>>>>>>>> 
>>>>>>>> ## Source and Intellectual Property Submission Plan
>>>>>>>> 
>>>>>>>> The Heron codebase is currently hosted on GitHub:
>>>>>>>> https://github.com/twitter/heron. During incubation, the codebase
>>>>>>>> will be migrated to Apache infrastructure. The source code is already
>>>>>>>> ASF 2.0 licensed.
>>>>>>>> 
>>>>>>>> ## External Dependencies
>>>>>>>> 
>>>>>>>> All external libraries have ASF 2.0 compatible licenses except for
>>>>>>>> pylint. The pylint library is GPL licensed, but is only used for
>>>>>>>> pre-build Python style checks and is neither bundled with, nor relied
>>>>>>>> upon by, the Heron source or binary release artifacts.
>>>>>>>> 
>>>>>>>> ## Cryptography
>>>>>>>> 
>>>>>>>> Heron does not use any cryptography libraries.
>>>>>>>> 
>>>>>>>> ## Required Resources
>>>>>>>> 
>>>>>>>> ### Mailing lists
>>>>>>>> 
>>>>>>>> private@heron.incubator.apache.org (with moderated subscriptions)
>>>>>>>> dev@heron.incubator.apache.org
>>>>>>>> commits@heron.incubator.apache.org
>>>>>>>> user@heron.incubator.apache.org
>>>>>>>> 
>>>>>>>> ## Subversion Directory
>>>>>>>> 
>>>>>>>> Git is the preferred source control system:
>>>>>>>> git://git.apache.org/heron
>>>>>>>> 
>>>>>>>> ## Issue Tracking
>>>>>>>> 
>>>>>>>> JIRA: Heron (HERON)
>>>>>>>> 
>>>>>>>> ## Initial Committers
>>>>>>>> 
>>>>>>>> * Andrew Jorgensen (andrew at andrewjorgensen dot com)
>>>>>>>> * Ashvin Agrawal (ashvin at apache dot org)*
>>>>>>>> * Avrilia Floratou (avrilia dot floratou at gmail dot com)
>>>>>>>> * Bill Graham (billgraham at apache dot org)*
>>>>>>>> * Brian Hatfield (bmhatfield at gmail dot com)
>>>>>>>> * Chris Kellogg (cckellogg at gmail dot com)
>>>>>>>> * Huijun Wu (huijun dot wu dot 2010 at gmail dot com)
>>>>>>>> * Karthik Ramasamy (karthik at gmail dot com)
>>>>>>>> * Maosong Fu (maosongfu at gmail dot com)
>>>>>>>> * Neng Lu (freeneng at gmail dot com)
>>>>>>>> * Runhang Li (obj dot runhang at gmail dot com)
>>>>>>>> * Sanjeev Kulkarni (sanjeevrk at gmail dot com)
>>>>>>>> * Supun Kamburugamuve (supun at apache dot org)*
>>>>>>>> * Thomas Sun (tom dot ssf at gmail dot com)
>>>>>>>> * Yaliang Wang (yaliang dot w dot wang at ieee dot org)
>>>>>>>> 
>>>>>>>> ## Affiliations
>>>>>>>> 
>>>>>>>> * Andrew Jorgensen (Google)
>>>>>>>> * Ashvin Agrawal (Microsoft)
>>>>>>>> * Avrilia Floratou (Microsoft)
>>>>>>>> * Bill Graham (Twitter)
>>>>>>>> * Brian Hatfield (Google)
>>>>>>>> * Chris Kellogg (Twitter)
>>>>>>>> * Huijun Wu (Twitter)
>>>>>>>> * Karthik Ramasamy (Streamlio)
>>>>>>>> * Maosong Fu (Twitter)
>>>>>>>> * Neng Lu (Twitter)
>>>>>>>> * Runhang Li (Twitter)
>>>>>>>> * Sanjeev Kulkarni (Streamlio)
>>>>>>>> * Supun Kamburugamuve (Indiana University)
>>>>>>>> * Thomas Sun (Twitter)
>>>>>>>> * Yaliang Wang (Twitter)
>>>>>>>> 
>>>>>>>> ## Sponsors
>>>>>>>> 
>>>>>>>> ### Champion
>>>>>>>> 
>>>>>>>> * Julien Le Dem (julien at apache dot org)
>>>>>>>> 
>>>>>>>> ### Nominated Mentors
>>>>>>>> 
>>>>>>>> * Jake Farrell (jfarrell at apache dot org)
>>>>>>>> * Jacques Nadeau (jacques at apache dot org)
>>>>>>>> * Julien Le Dem (julien at apache dot org)
>>>>>>>> 
>>>>>>>> ### Sponsoring Entity
>>>>>>>> 
>>>>>>>> The Apache Incubator
>>>>>>>> 
>>>>>>>> ### Footnotes
>>>>>>>> 
>>>>>>>> 1 - Papers detailing Heron are available at
>>>>>>>> http://dl.acm.org/citation.cfm?id=2742788 and
>>>>>>>> http://sites.computer.org/debull/A15dec/p15.pdf.
>>>>>>>> 2 - http://home.apache.org/phonebook.html?uid=billgraham
>>>>>>>> 3 - http://home.apache.org/phonebook.html?uid=ashvin
>>>>>>>> 4 - http://home.apache.org/phonebook.html?uid=supun
>>>>>>> 
>>>>>>> ---------------------------------------------------------------------
>>>>>>> To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
>>>>>>> For additional commands, e-mail: general-help@incubator.apache.org
>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> --
>>>>> Supun Kamburugamuve
>>>>> Member, Apache Software Foundation; http://www.apache.org
>>>>> E-mail: supun@apache.org; Mobile: +1 812 219 2563
>>>> 
>>>> 
>>>> 
>>>> 
>>> 
>>> 
>>> --
>>> ~/William
>> 
>> 
>> 



