incubator-general mailing list archives

From "John D. Ament" <johndam...@apache.org>
Subject Re: [VOTE] Heron to enter Apache Incubator
Date Fri, 23 Jun 2017 12:08:51 GMT
Bill,

Would I be correct in understanding that Heron implements the same protocol
as Storm, but the actual implementation is different?

John

On Fri, Jun 23, 2017 at 1:36 AM Bill Graham <billgraham@gmail.com> wrote:

> It's grossly inaccurate to refer to Heron as a Storm fork. There are about
> 132k lines of code in the Heron codebase (plus 166k of codegen), of which
> about 7k are to implement the Apache Storm API bindings to the Heron API.
>
> The Rationale section of the proposal discusses the Heron architecture,
> which is a complete rewrite with little in common with Storm. The only
> overlap is that Heron supports the Storm user API for ease of migration.
>
> The value of having multiple projects to solve a common need is that each
> can foster innovation, collaboration and exchange of ideas in different
> ways. This is not a new concept to Apache. You can look at the incubator
> discussions around Accumulo vs HBase (two implementations of the BigTable
> paper) for example, to see how two different approaches to a shared problem
> can be a good thing.
>
> thanks,
> Bill
>
> On Thu, Jun 22, 2017 at 6:45 PM, Von Gosling <vongosling@apache.org>
> wrote:
>
> > Hi,
> >
> > I will give +1 (non-binding), but
> >
> > I have a similar question: with so many streaming frameworks at Apache,
> > how will each of them develop its own community?
> >
> >
> >
> >
> > Best Regards,
> > Von Gosling
> >
> >
> >
> > On Jun 23, 2017, at 08:51, Edward Capriolo <edlinuxguru@gmail.com> wrote:
> >
> > I believe heron and storm should be merged back together. I do not see the
> > value of storm and a storm fork in the asf.
> >
> > On Thursday, June 22, 2017, Bill Graham <billgraham@gmail.com> wrote:
> >
> > Thanks Taylor for relaying these sentiments, especially the part about the
> > Heron website which is indeed poorly worded (I suspect this could have been
> > the result of internal docs being open-sourced). I've opened this pull
> > request to update the language regarding Storm:
> >
> > https://github.com/twitter/heron/pull/1979
> >
> > On Thu, Jun 22, 2017 at 12:21 PM, P. Taylor Goetz <ptgoetz@gmail.com> wrote:
> >
> > The Apache Storm PMC had a discussion regarding the Heron proposal. In the
> > spirit of openness I wanted to bring some of the sentiments expressed in
> > that discussion back to this list. Please note that I am paraphrasing from
> > that discussion and attempting to relay opinions of the collective PMC, not
> > necessarily that of any individual.
> >
> > * There is a general disappointment that the Heron community chose not to
> > engage with the Storm community and instead chose a separate path.
> > * A majority of the PMC supports Heron’s incubation, though some felt it
> > would result in unnecessary duplication of effort.
> > * A majority of the PMC supports the two projects working closely
> > together. A number of PMC members suggested the two projects merge in some
> > way.
> > * Many PMC members took issue with some of the marketing language on the
> > Heron website, particularly Heron being billed as “the direct successor to
> > Apache Storm” and the prominent “Upgrade from Storm” links. The main
> > concern here was that such phrasing has a somewhat hostile tone and
> > undermines the desire for better collaboration, as well as confusing users.
> >
> > One of my goals as a proposed mentor for Heron and a Storm PMC member is
> > to address some of these concerns and encourage collaboration. As I
> > mentioned to the Storm PMC on that thread, if there are ongoing concerns
> > from either the Storm PMC or the Heron PPMC about me acting as a mentor, I
> > would be willing to step down.
> >
> > +1 (binding)
> >
> > -Taylor
> >
> > On Jun 16, 2017, at 4:41 PM, Bill Graham <billgraham@gmail.com> wrote:
> >
> >
> > Hi,
> >
> > Based on the discussion on the incubator mailing list[1] I would like to
> > call a vote to add Heron to the Apache Incubator.
> >
> > The full proposal is available below, and is also available on the Apache
> > Incubator wiki at:
> >   https://wiki.apache.org/incubator/HeronProposal
> >
> > Please vote:
> > [ ] +1, bring Heron into Incubator
> > [ ] -1, do not bring Heron into Incubator, because...
> >
> > The vote will be open for 7 days until Friday June 23 at 14:00 PT.
> >
> > Thank you
> >
> > 1 -
> > https://lists.apache.org/thread.html/fb91f527ef479bb5df45bf2c9d93b7786c3fa6cdbfeba3128599df79@%3Cgeneral.incubator.apache.org%3E
> >
> >
> >
> >
> > = Heron Proposal =
> >
> > = Abstract =
> > Heron is a real-time, distributed, fault-tolerant stream processing engine
> > initially developed by Twitter.
> >
> > = Proposal =
> >
> > Heron is a real-time stream processing engine built for high performance,
> > ease of manageability, performance predictability and developer
> > productivity[1]. We wish to develop a community around Heron to increase
> > contributions and see Heron thrive in an open forum.
> >
> > = Background =
> >
> > Heron provides the ability for developers to compose directed acyclic
> > graphs (DAGs) of real-time query execution logic (i.e. a topology) and
> > submit the topology to execute on a pluggable job scheduling system (e.g.,
> > Apache Aurora, YARN, Marathon, etc). Users can employ either the native
> > Heron API or the Apache Storm API to develop the topology. Heron supports
> > the Storm API for ease of migration, but beyond that Heron’s architecture
> > differs considerably from Storm’s.
> >
> > Users submit a topology to the scheduler using the Heron client, which uses
> > the Heron binary libraries to deploy all daemons required to run and manage
> > the topology. The topology therefore has no reliance on centrally managed
> > Heron services, only on a generic job scheduling system, which lends itself
> > well to being run on top of Apache Aurora/Mesos or Apache Hadoop/YARN
> > (among others).
> >
> > The scheduler runs each topology as a job consisting of multiple
> > containers. One of the containers runs the topology master, responsible for
> > managing the topology. The remaining containers each run a stream manager
> > responsible for data routing, a metrics manager that collects and reports
> > various metrics and a number of processes called Heron instances which run
> > the user-defined logic on the stream of tuples. Parallelism is achieved via
> > process-based isolation of Heron instances, which provides predictable
> > performance while simplifying debugging. The containers are allocated and
> > managed by the scheduler framework based on resource availability of nodes
> > in the cluster. The metadata for the topology, such as the physical plan
> > and execution details, are stored in the pluggable Heron State Manager
> > (e.g. Apache ZooKeeper).
> >
> > = Rationale =
> >
> > Heron is a general-purpose, modular and extensible platform that can be
> > leveraged to support common, real-time analytics use cases. There is an
> > increasing demand for open-source, scalable real-time analytics systems. We
> > believe that Heron can be leveraged by other organizations to build
> > streaming applications that can benefit from its robustness, high
> > performance, adaptability to cloud environments and ease of use. Moreover,
> > we hope that open-sourcing Heron will help to further evolve the technology
> > as the project attracts contributors with diverse backgrounds and areas of
> > expertise.
> >
> > We believe the Apache foundation is a great fit as the long-term home for
> > Heron, as it provides an established process for community-driven
> > development and decision making by consensus. This is exactly the model we
> > want for future Heron development.
> >
> > = Initial Goals =
> >
> > * Move the existing codebase, website, documentation, and mailing lists to
> > Apache-hosted infrastructure.
> > * Integrate with the Apache development process.
> > * Ensure all dependencies are compliant with Apache License version 2.0.
> > * Incrementally develop and release per Apache guidelines.
> >
> > = Current Status =
> >
> > Heron is a stable project used in production at Twitter since 2014 and open
> > sourced under the ASL v2 license in 2016. The Heron source code is
> > currently hosted at github.com (https://github.com/twitter/heron), which
> > will seed the Apache git repository.
> >
> > = Meritocracy =
> >
> > By submitting this incubator proposal, we’re expressing our intent to build
> > a diverse developer community around Heron that will conduct itself
> > according to The Apache Way and use a meritocratic means of building its
> > committer base. Several companies and universities have already expressed
> > interest in and contributed to Heron. Our goal is to grow the Heron
> > community by encouraging open communication, contribution and participation
> > of all types, and ensuring that contributors are recognized appropriately.
> >
> >
> > = Community =
> >
> > Heron is currently being used by Twitter, Google, Machine Zone and
> > ndustrial.io and has received significant contributions by Microsoft and
> > Streamlio. By bringing Heron into the Apache ecosystem, we believe we can
> > attract even more developers who are interested in creating real-time
> > systems to build the project's contributor base.
> >
> > == Core Developers ==
> >
> > Current core developers are engineers from Twitter, Google, Microsoft and
> > Streamlio.
> >
> > == Alignment ==
> >
> > Heron utilizes a number of Apache technologies. Heron leverages Apache
> > ZooKeeper for coordination and has scheduler implementations to integrate
> > with Apache Mesos, Apache Aurora and Apache Hadoop's YARN (via Apache REEF)
> > as well as spout implementations to integrate with Apache Kafka and metrics
> > implementations to integrate with Scribe. Heron also implements the Apache
> > Storm user-level API, which allows topologies written against Storm to run
> > in Heron. We believe that having Heron at Apache will help further the
> > growth of the streaming compute community, as well as encourage cooperation
> > and developer cross pollination with other Apache projects.
> >
> > = Known Risks =
> >
> > == Orphaned Products ==
> >
> > The risk of the Heron project being abandoned is minimal. It is used in
> > production at Twitter and Google and other companies are evaluating or
> > adopting it for production use.
> >
> > == Inexperience with Open Source ==
> >
> > All of the core contributors to the project have considerable experience
> > with open source software development. Bill Graham[2], Ashvin Agrawal[3]
> > and Supun Kamburugamuve[4], committers on the project, are PMC members of
> > other Apache projects and Bill and Ashvin have gone through the Apache
> > incubator process. Twitter has already donated numerous projects to the ASF
> > (e.g., Apache Mesos, Apache Aurora, Apache Parquet). We also plan to be
> > mentored by experienced ASF members that can help with any roadblocks.
> >
> > == Homogenous Developers ==
> >
> > Initial committers come from 5 separate organizations. Our intention is to
> > increase the diversity of contributing developers and their affiliations.
> > To date github contributions have come from approximately 50 contributors
> > from outside the Twitter team.
> >
> > == Reliance on Salaried Developers ==
> >
> > It is expected that Heron development will occur on both salaried time and
> > on volunteer time. The majority of initial committers are paid by their
> > employers to contribute to this project. We are committed to recruiting
> > additional committers from other organizations as well as non-salaried
> > committers to join the project.
> >
> > == Relationships with Other Apache Products ==
> >
> > As mentioned in the Alignment section, Heron implements the Apache Storm
> > API and integrates with multiple Apache schedulers (Apache Mesos, Apache
> > Aurora and Apache Hadoop's YARN) as well as Apache ZooKeeper and Apache
> > Thrift.
> >
> > == An Excessive Fascination with the Apache Brand ==
> >
> > Heron's popularity is growing in the streaming compute space and we are
> > long time supporters of the Apache brand. This proposal is not for the
> > purpose of generating publicity. Rather, the primary benefits of joining
> > Apache are those of community building and open decision making outlined
> > in the Rationale section.
> >
> > == Documentation ==
> >
> > This proposal exists online as
> > http://wiki.apache.org/incubator/HeronProposal. Extensive documentation can
> > be found on github at https://twitter.github.io/heron and the source code
> > is well documented.
> >
> > == Source and Intellectual Property Submission Plan ==
> >
> > The Heron codebase is currently hosted on Github:
> > https://github.com/twitter/heron. During incubation, the codebase will be
> > migrated to Apache infrastructure. The source code is already ASF 2.0
> > licensed.
> >
> > == External Dependencies ==
> >
> > All external libraries have ASF 2.0 compatible licenses except for pylint.
> > The pylint library is GPL licensed, but is only used for pre-build Python
> > style checks and is neither bundled with, nor relied upon by, the Heron
> > source or binary release artifacts.
> >
> > == Cryptography ==
> >
> > Heron does not use any cryptography libraries.
> >
> > = Required Resources =
> >
> > == Mailing lists ==
> >
> > * private@heron.incubator.apache.org (with moderated subscriptions)
> > * dev@heron.incubator.apache.org
> > * commits@heron.incubator.apache.org
> > * user@heron.incubator.apache.org
> >
> >
> > == Subversion Directory ==
> >
> > Git is the preferred source control system: git://git.apache.org/heron
> >
> > == Issue Tracking ==
> >
> > JIRA: Heron (HERON)
> >
> > == Initial Committers ==
> >
> > * Andrew Jorgensen (andrew at andrewjorgensen dot com)
> > * Ashvin Agrawal (ashvin at apache dot org)*
> > * Avrilia Floratou (avrilia dot floratou at gmail dot com)
> > * Bill Graham (billgraham at apache dot org)*
> > * Brian Hatfield (bmhatfield at gmail dot com)
> > * Chris Kellogg (cckellogg at gmail dot com)
> > * Huijun Wu (huijun dot wu dot 2010 at gmail dot com)
> > * Karthik Ramasamy (karthik at gmail dot com)
> > * Maosong Fu (maosongfu at gmail dot com)
> > * Neng Lu (freeneng at gmail dot com)
> > * Runhang Li (obj dot runhang at gmail dot com)
> > * Sanjeev Kulkarni (sanjeevrk at gmail dot com)
> > * Supun Kamburugamuve (supun at apache dot org)*
> > * Thomas Sun (tom dot ssf at gmail dot com)
> > * Yaliang Wang (yaliang dot w dot wang at ieee dot org)
> >
> > == Affiliations ==
> >
> > * Andrew Jorgensen (Google)
> > * Ashvin Agrawal (Microsoft)
> > * Avrilia Floratou (Microsoft)
> > * Bill Graham (Twitter)
> > * Brian Hatfield (Google)
> > * Chris Kellogg (Twitter)
> > * Huijun Wu (Twitter)
> > * Karthik Ramasamy (Streamlio)
> > * Maosong Fu (Twitter)
> > * Neng Lu (Twitter)
> > * Runhang Li (Twitter)
> > * Sanjeev Kulkarni (Streamlio)
> > * Supun Kamburugamuve (Indiana University)
> > * Thomas Sun (Twitter)
> > * Yaliang Wang (Twitter)
> >
> > = Sponsors =
> >
> > == Champion ==
> >
> > * Julien Le Dem (julien at apache dot org)
> >
> > == Nominated Mentors ==
> >
> > * Jake Farrell (jfarrell at apache dot org)
> > * Jacques Nadeau (jacques at apache dot org)
> > * Julien Le Dem (julien at apache dot org)
> > * P. Taylor Goetz (ptgoetz at apache dot org)
> >
> > == Sponsoring Entity ==
> >
> > The Apache Incubator
> >
> > == Footnotes ==
> >
> > * 1 - Papers detailing Heron are available at
> > http://dl.acm.org/citation.cfm?id=2742788 and
> > http://sites.computer.org/debull/A15dec/p15.pdf.
> > * 2 - http://home.apache.org/phonebook.html?uid=billgraham
> > * 3 - http://home.apache.org/phonebook.html?uid=ashvin
> > * 4 - http://home.apache.org/phonebook.html?uid=supun
> >
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
> > For additional commands, e-mail: general-help@incubator.apache.org
> >
> >
> >
> >
> >
> >
> > --
> > Sorry this was sent from mobile. Will do less grammar and spell check than
> > usual.
> >
> >
> >
>
