apex-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Tushar Gosavi <tus...@datatorrent.com>
Subject Re: Java stream api
Date Tue, 05 Apr 2016 19:41:11 GMT
On Tue, Apr 5, 2016 at 12:14 PM, Siyuan Hua <siyuan@datatorrent.com> wrote:

> I have collected some open topics/questions for discussion already from
> folks who already reviewed the code
>
> 1.The name, name of the Stream and the StreamSource
>
> 2. Build dag in an incremental way vs lazy population. Incremental way is
> easier to implement (what I did right now) and it create one edge for the
> dag for each transformation method. Lazy population means keep the method
> chain in memory until it needs to submit the dag either locally or to the
> cluster, in this way, some optimization(change order of transforation ex.)
> might be done because you have an overall picture.
>
> +1 for lazy population. This way we could swap in better implementation
for for combination of
transformation.



> 3. How to easily extend the Stream interface and it's implementation
>
> Can we add a factory using which we can generate a stream by adding input
operator. The factory would help us to change the implementation of
operators for different purpose. Like one Factory can be for stream
processing, other Factory can be for batch processing.


> 4. How to deal with operator with multiple input ports/output ports.



>

One way to handle that would be to add a new abstraction StreamSet which
also implements Stream, and provides an additional method get("name"), When
you add a operator it creates a streamSet, by default it will use first
port of the operator for stream.  but user can get a required stream by
calling get. For example suppose if parser has two ports one of them is
error, we can get an error stream and start a different stream from it.

stream.addOperator("1", new ParseOperator() { } ).get("error").map("map1",
new MapFunction());

I can't think of any solution now to support multiple input ports.


> Again, I appreciate any ideas and suggestion for those questions above.
>
> And feel free to ask more questions you have
>
> Regards,
> Siyuan
>
> On Mon, Apr 4, 2016 at 11:30 PM, Siyuan Hua <siyuan@datatorrent.com>
> wrote:
>
> > Hi community,
> >
> > I have submitted my first commit of stream api into my public repository
> > here
> > https://github.com/siyuanh/incubator-apex-malhar/tree/stream
> >
> > You can think this is the prototype of the Java Stream API proposal  here
> >
> >
> https://docs.google.com/document/d/163LmQjX860b61NDe3ZzR0hRTPtE-4GF0iHaVhmHQssY/edit#heading=h.aytn6rz7u1e4
> >
> > A simple walkthrough of the code:
> >
> > ApexStream is the core interface to build a dag in stream style. Default
> > implementation is in ApexStreamImpl
> >
> > Function is a super interface for all simple transformation, it has
> > several sub interfaces like MapFunction, ReduceFunction etc.
> >
> > FunctionOperator is a wrapper for functions that pass param from input
> > port to function and deliver the return value to output port.
> >
> > And you can find the word count demo code below
> >
> >
> >
> https://github.com/siyuanh/incubator-apex-malhar/blob/stream/demos/wordcount/src/main/java/com/datatorrent/demos/wordcount/ApplicationWithStreamAPI.java
> >
> >
> >
> > As we want to release this API asap. We want the whole community to help
> > define a clear scope of what we want to achieve in the first cut. Any
> > suggestions, ideas are very welcome.
> >
> > Please please do contribute to this :)
> >
> > Thanks!
> > Siyuan
> >
> >
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message