apex-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Thomas Weise <tho...@datatorrent.com>
Subject Re: Java stream api
Date Tue, 12 Apr 2016 15:57:39 GMT
This looks really good. It will give Apex the "beginner API" to build apps.

As already discussed, this API will be an opportunity to apply optimization
internally, so we should go with lazy DAG building approach. It is probably
a good idea to do this in the first cut so that other contributors will
follow the right pattern when adding more operations.

It will also be helpful to provide an example how the API can be extended
or operations easily overridden with customized operator.

It would be great if we can pull the initial rev into 3.4.0. This will open
it up for others contribution. It will be @Evolving, changes are expected.

Thomas


On Tue, Apr 5, 2016 at 12:41 PM, Tushar Gosavi <tushar@datatorrent.com>
wrote:

> On Tue, Apr 5, 2016 at 12:14 PM, Siyuan Hua <siyuan@datatorrent.com>
> wrote:
>
> > I have collected some open topics/questions for discussion already from
> > folks who already reviewed the code
> >
> > 1.The name, name of the Stream and the StreamSource
> >
> > 2. Build dag in an incremental way vs lazy population. Incremental way is
> > easier to implement (what I did right now) and it create one edge for the
> > dag for each transformation method. Lazy population means keep the method
> > chain in memory until it needs to submit the dag either locally or to the
> > cluster, in this way, some optimization(change order of transforation
> ex.)
> > might be done because you have an overall picture.
> >
> > +1 for lazy population. This way we could swap in better implementation
> for for combination of
> transformation.
>
>
>
> > 3. How to easily extend the Stream interface and it's implementation
> >
> > Can we add a factory using which we can generate a stream by adding input
> operator. The factory would help us to change the implementation of
> operators for different purpose. Like one Factory can be for stream
> processing, other Factory can be for batch processing.
>
>
> > 4. How to deal with operator with multiple input ports/output ports.
>
>
>
> >
>
> One way to handle that would be to add a new abstraction StreamSet which
> also implements Stream, and provides an additional method get("name"), When
> you add a operator it creates a streamSet, by default it will use first
> port of the operator for stream.  but user can get a required stream by
> calling get. For example suppose if parser has two ports one of them is
> error, we can get an error stream and start a different stream from it.
>
> stream.addOperator("1", new ParseOperator() { } ).get("error").map("map1",
> new MapFunction());
>
> I can't think of any solution now to support multiple input ports.
>
>
> > Again, I appreciate any ideas and suggestion for those questions above.
> >
> > And feel free to ask more questions you have
> >
> > Regards,
> > Siyuan
> >
> > On Mon, Apr 4, 2016 at 11:30 PM, Siyuan Hua <siyuan@datatorrent.com>
> > wrote:
> >
> > > Hi community,
> > >
> > > I have submitted my first commit of stream api into my public
> repository
> > > here
> > > https://github.com/siyuanh/incubator-apex-malhar/tree/stream
> > >
> > > You can think this is the prototype of the Java Stream API proposal
> here
> > >
> > >
> >
> https://docs.google.com/document/d/163LmQjX860b61NDe3ZzR0hRTPtE-4GF0iHaVhmHQssY/edit#heading=h.aytn6rz7u1e4
> > >
> > > A simple walkthrough of the code:
> > >
> > > ApexStream is the core interface to build a dag in stream style.
> Default
> > > implementation is in ApexStreamImpl
> > >
> > > Function is a super interface for all simple transformation, it has
> > > several sub interfaces like MapFunction, ReduceFunction etc.
> > >
> > > FunctionOperator is a wrapper for functions that pass param from input
> > > port to function and deliver the return value to output port.
> > >
> > > And you can find the word count demo code below
> > >
> > >
> > >
> >
> https://github.com/siyuanh/incubator-apex-malhar/blob/stream/demos/wordcount/src/main/java/com/datatorrent/demos/wordcount/ApplicationWithStreamAPI.java
> > >
> > >
> > >
> > > As we want to release this API asap. We want the whole community to
> help
> > > define a clear scope of what we want to achieve in the first cut. Any
> > > suggestions, ideas are very welcome.
> > >
> > > Please please do contribute to this :)
> > >
> > > Thanks!
> > > Siyuan
> > >
> > >
> > >
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message