streams-dev mailing list archives

From Matt Franklin <m.ben.frank...@gmail.com>
Subject Re: [DISCUSS] Beam
Date Wed, 23 Nov 2016 17:03:03 GMT
On Mon, Nov 21, 2016 at 6:27 PM Joey Frazee <joey.frazee@icloud.com> wrote:

> I'm in favor of this for a few reasons:
>
> - There are enough stream processing frameworks out there that it makes it
> hard for us to offer much on that front. I don't think streams fills a gap
> for this internally so we have more to contribute in creating something
> that people can use with Beam.
>
> - It should help make the story clearer to outsiders on "how to run
> streams".
>
> - While it may be immature, as Trevor and Suneel mention, I think they can
> probably do as good a job keeping the interfaces stable as we can in
> maintaining runtimes and interfaces internally. We'll give up some control
> but they'll do a good job too.
>

I am +1 for finding another (de facto) standard for stream processing APIs,
but I also worry about a hard dependency on Beam. It would be nice if
there were another alternative similar to ReactiveStreams[1] but with a
more flexible model.

[1] http://www.reactive-streams.org/


> Now there will for sure be some drawbacks. We'll be beholden to someone
> else and probably have to scramble to stay up to date sometimes. And it's
> naive to think it's ever going to provide for every feature of the
> underlying runner, so we might find ourselves in situations where something
> that should be easy is hard.
>
> -joey
>
> > On Nov 21, 2016, at 3:43 PM, sblackmon <sblackmon@apache.org> wrote:
> >
> >
> >
> >> On November 21, 2016 at 2:19:11 PM, Suneel Marthi (suneel.marthi@gmail.com) wrote:
> >>
> >> I agree too, I have been playing with Beam for a few months now without a
> >> runner and the API is still immature, but nevertheless keep it on the radar
> >> since it's gonna be a TLP soon.
> >>
> >>
> >> From a Streams perspective, how do we see the project using Beam (similar to
> >> Spark/Flink now)? If so, we can build a preliminary version of Beam support with
> >> Local Dataflow runner.
> >>
> >
> > Hypothesis expanded:
> >
> > We could implement all the components in the project (providers,
> > persisters, and processors) directly against Beam APIs (Source, Sink, DoFn,
> > etc.) and support two primary execution models for project capabilities:
> >
> > 1) Direct instantiation of a single instance of a component; call the Beam
> > equivalents of setup, process, and teardown yourself. This is already common
> > throughout the project's unit and integration tests.
> > 2) Compose a Beam Pipeline combining Streams and non-Streams components, and
> > run it with your preferred Beam runner(s).
> >
> > In this scenario I think streams-runtimes would either go away entirely
> > or only contain helper methods (no classes with a static main).
> >
> >>
> >>
> >> On Mon, Nov 21, 2016 at 3:14 PM, Trevor Grant wrote:
> >>
> >>> IMHO, Beam is too immature and the API is too unstable at this time to
> >>> integrate; however, I am in favor of watching the Beam project develop and
> >>> starting to think through what an integration might look like.
> >>>
> >>> Just my .02, based on some fairly lackluster experiences with Apache Beam.
> >>>
> >>> tg
> >>>
> >>>
> >>>
> >>>
> >>> Trevor Grant
> >>> Data Scientist
> >>> https://github.com/rawkintrevo
> >>> http://stackexchange.com/users/3002022/rawkintrevo
> >>> http://trevorgrant.org
> >>>
> >>> *"Fortunate is he, who is able to know the causes of things." -Virgil*
> >>>
> >>>
> >>>> On Mon, Nov 21, 2016 at 11:36 AM, sblackmon wrote:
> >>>>
> >>>> Beam appears to be on its way to being the de facto standard for data
> >>>> pipelines.
> >>>>
> >>>> I’d like to start a real discussion about whether and how to align
> >>>> streams interfaces with Beam interfaces.
> >>>>
> >>>> To pose a straw-man theory for discussion:
> >>>>
> >>>> Hypothesis: Streams would benefit by replacing the interfaces in
> >>>> streams-core entirely with Beam interfaces.
> >>>>
> >>>> a) Do we agree that the flexibility and performance gains from doing so,
> >>>> presuming it’s possible, would be significant?
> >>>> b) Are there any inevitable flexibility, performance, complexity, or
> >>>> other blockers or compromises we should discuss?
> >>>> c) What arguments are there for retaining our interfaces and providing
> >>>> Beam compatibility in a runtime module binding (within streams) vs
> >>>> deprecating our existing interfaces and switching over completely?
> >>>> d) Obviously doing this would be a lot of work. What level of commitment
> >>>> is there from the group to work on this?
> >>>>
> >>>> Steve
> >>>> On October 25, 2016 at 3:47:11 PM, sblackmon (sblackmon@apache.org) wrote:
> >>>>
> >>>> Regarding Beam, there have been a number of ideas and theories floated on
> >>>> the list, but nothing concrete has been proposed or discussed in depth.
> >>>>
> >>>> Steve
> >>>> On October 25, 2016 at 10:21:52 AM, Suneel Marthi (suneel.marthi@gmail.com) wrote:
> >>>>
> >>>> Is support for Kafka Streams and Apache Beam on the roadmap?
> >>>>
> >>>
> >
>
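
A rough sketch of the two execution models in the quoted "Hypothesis expanded" text, in plain Java with no Beam dependency. The Component interface, both processors, and the runDirect/runPipeline helpers are hypothetical names made up for illustration; they are not Beam or Streams APIs. With Beam itself, the lifecycle would presumably map onto a DoFn's @Setup, @ProcessElement, and @Teardown methods, and model 2 would be a Pipeline handed to a real runner rather than the toy loop used here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class ExecutionModelsSketch {

    // Hypothetical stand-in for a Beam-style component lifecycle (NOT a Beam interface).
    interface Component<I, O> {
        void setup();
        List<O> process(I input);
        void teardown();
    }

    // Hypothetical "Streams" processor: upper-cases each datum.
    static class UppercaseProcessor implements Component<String, String> {
        private boolean ready = false;

        public void setup() { ready = true; }

        public List<String> process(String input) {
            if (!ready) {
                throw new IllegalStateException("setup() was not called");
            }
            return Collections.singletonList(input.toUpperCase());
        }

        public void teardown() { ready = false; }
    }

    // Hypothetical "non-Streams" component: appends an exclamation mark.
    static class ExclaimProcessor implements Component<String, String> {
        public void setup() { }

        public List<String> process(String input) {
            return Collections.singletonList(input + "!");
        }

        public void teardown() { }
    }

    // Model 1: instantiate one component and drive setup/process/teardown yourself,
    // the way a unit or integration test would.
    static List<String> runDirect(Component<String, String> component, List<String> inputs) {
        component.setup();
        try {
            List<String> out = new ArrayList<>();
            for (String in : inputs) {
                out.addAll(component.process(in));
            }
            return out;
        } finally {
            component.teardown();
        }
    }

    // Model 2: compose several components into a pipeline. This sequential loop is a
    // toy in-process "runner"; a real runner would handle distribution and scheduling.
    static List<String> runPipeline(List<Component<String, String>> stages, List<String> inputs) {
        List<String> current = inputs;
        for (Component<String, String> stage : stages) {
            current = runDirect(stage, current);
        }
        return current;
    }

    public static void main(String[] args) {
        // Model 1: prints [HELLO, BEAM]
        System.out.println(runDirect(new UppercaseProcessor(), Arrays.asList("hello", "beam")));

        // Model 2: prints [HELLO!]
        List<Component<String, String>> stages = new ArrayList<>();
        stages.add(new UppercaseProcessor());
        stages.add(new ExclaimProcessor());
        System.out.println(runPipeline(stages, Collections.singletonList("hello")));
    }
}
```

If something like this held up, the point about streams-runtimes follows: nothing above needs a class with a static main beyond the demo itself, only helper methods.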
