streams-dev mailing list archives

From sblackmon <sblack...@apache.org>
Subject Re: [DISCUSS] Beam
Date Mon, 21 Nov 2016 21:43:19 GMT
 

On November 21, 2016 at 2:19:11 PM, Suneel Marthi (suneel.marthi@gmail.com) wrote:

> I agree too. I have been playing with Beam for a few months now without a
> runner and the API is still immature, but nevertheless keep it on the radar
> since it's gonna be a TLP soon.
>  
>  
> From a Streams perspective, how do we see the project using Beam (similar to
> Spark/Flink now)? If so, we could start with a preliminary version of Beam
> support with the Local Dataflow runner.
>  

Hypothesis expanded:  

We could implement all the components in the project (providers, persisters, and processors)
directly against Beam APIs (Source, Sink, DoFn, etc.) and support two primary execution models
for project capabilities:

1) Direct instantiation: create a single instance of a component and call the Beam equivalents
of setup, process, and teardown yourself. This is already common throughout the project's unit
and integration tests.

2) Pipeline composition: compose a Beam Pipeline combining Streams and non-Streams components,
and run it with your preferred Beam runner(s).

In this scenario I think streams-runtimes would either go away entirely or contain only helper
methods (no classes with a static main).
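
To make the two execution models above concrete, here is a rough, dependency-free sketch. It
does not use the Beam SDK: Component, UppercaseProcessor, run_directly, and run_pipeline are
hypothetical stand-ins invented for illustration, and the toy "runner" only mimics the idea of
a runner owning the component lifecycle. It assumes a component exposes setup, process, and
teardown, as described in the hypothesis.

```python
from typing import Any, Iterable, Iterator, List


class Component:
    """Hypothetical stand-in for a Beam-style DoFn; this is NOT the Beam API."""

    def setup(self) -> None:
        pass

    def process(self, element: Any) -> Iterable[Any]:
        raise NotImplementedError

    def teardown(self) -> None:
        pass


class UppercaseProcessor(Component):
    """Illustrative processor: emits each string element in upper case."""

    def setup(self) -> None:
        self.ready = True

    def process(self, element: str) -> Iterator[str]:
        yield element.upper()

    def teardown(self) -> None:
        self.ready = False


def run_directly(component: Component, elements: Iterable[Any]) -> List[Any]:
    """Model 1: the caller drives the lifecycle, as a unit test would."""
    component.setup()
    try:
        out: List[Any] = []
        for element in elements:
            out.extend(component.process(element))
        return out
    finally:
        component.teardown()


def run_pipeline(stages: List[Component], elements: Iterable[Any]) -> List[Any]:
    """Model 2: a toy in-process 'runner' chains stages and owns the lifecycle."""
    for stage in stages:
        stage.setup()
    try:
        current = list(elements)
        for stage in stages:
            nxt: List[Any] = []
            for element in current:
                nxt.extend(stage.process(element))
            current = nxt
        return current
    finally:
        for stage in stages:
            stage.teardown()
```

The same component class is used unchanged in both models; only the code that drives setup,
process, and teardown differs, which is why streams-runtimes might shrink to helper methods.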

>  
>  
> On Mon, Nov 21, 2016 at 3:14 PM, Trevor Grant  
> wrote:
>  
> > IMHO, Beam is too immature and the API is too unstable at this time to
> > integrate, however I am in favor of watching the Beam project develop and
> > starting to think through what an integration might look like.
> >
> > Just my .02, based on some fairly lackluster experiences with Apache Beam.
> >
> > tg
> >
> >
> >
> >
> > Trevor Grant
> > Data Scientist
> > https://github.com/rawkintrevo
> > http://stackexchange.com/users/3002022/rawkintrevo
> > http://trevorgrant.org
> >
> > *"Fortunate is he, who is able to know the causes of things." -Virgil*
> >
> >
> > On Mon, Nov 21, 2016 at 11:36 AM, sblackmon wrote:
> >
> > > Beam appears to be on its way to being the de facto standard for data
> > > pipelines.
> > >
> > > I’d like to start a real discussion about whether and how to align
> > > streams interfaces with Beam interfaces.
> > >
> > > To pose a straw-man theory for discussion:
> > >
> > > Hypothesis: Streams would benefit by replacing the interfaces in
> > > streams-core entirely with beam interfaces.
> > >
> > > a) Do we agree that the flexibility and performance gains from doing so,
> > > presuming it’s possible, would be significant?
> > > b) Are there any inevitable flexibility, performance, complexity, or
> > > other blockers or compromises we should discuss?
> > > c) What arguments are there for retaining our interfaces and providing
> > > beam compatibility in a runtime module binding (within streams) vs
> > > deprecating our existing interfaces and switching over completely?
> > > d) Obviously doing this would be a lot of work. What level of commitment
> > > is there from the group to work on this?
> > >
> > > Steve
> > > On October 25, 2016 at 3:47:11 PM, sblackmon (sblackmon@apache.org) wrote:
> > >
> > > Regarding Beam, there have been a number of ideas and theories floated on
> > > the list, but nothing concrete has been proposed or discussed in depth.
> > >
> > > Steve
> > > On October 25, 2016 at 10:21:52 AM, Suneel Marthi (suneel.marthi@gmail.com)
> > > wrote:
> > >
> > > Is support for Kafka Streams and Apache Beam on the roadmap ?
> > >
> >

