flink-dev mailing list archives

From Gyula Fóra <gyula.f...@gmail.com>
Subject Re: Adding the streaming project to the main repository
Date Fri, 04 Jul 2014 18:28:34 GMT

I would like to give a quick update on the status of the Flink streaming
project. All of the dependencies in our main branch are now updated to the
current 0.6-snapshot, and the project is now decomposed into 3
subprojects: core, examples, and addons.

We have created a separate branch for our 0.2 release, which depends on
0.5. From now on, however, we will focus our development efforts on being
able to merge our main branch into the main Flink project.


Gyula, Márton & Gábor

On Wed, Jul 2, 2014 at 2:05 PM, Márton Balassi <mbalassi@ilab.sztaki.hu> wrote:

> To extend the functionality of Flink, a separate branch of development was
> dedicated to low-latency, distributed stream processing support. The
> development started in March 2014 and is approaching a state where
> it might be considered a candidate for becoming part of the main
> repository.
> As of today, a WordCount
> <https://github.com/stratosphere/stratosphere-streaming/blob/master/src/main/java/eu/stratosphere/streaming/examples/wordcount/WordCountLocal.java#L30-41>
> example streaming program is fairly similar to the one that the batch
> API provides:
> StreamExecutionEnvironment env = new StreamExecutionEnvironment();
> DataStream<Tuple2<String, Integer>> dataStream =
>     env.readTextFile("src/test/resources/testdata/hamlet.txt")
>        .flatMap(new WordCountSplitter())
>        .partitionBy(0)
>        .map(new WordCountCounter());
> dataStream.print();
> env.execute();
> The user-defined functions extend the same classes as in the batch
> case (e.g. a FlatMapFunction for a flatMap, see WordCountSplitter
> <https://github.com/stratosphere/stratosphere-streaming/blob/master/src/main/java/eu/stratosphere/streaming/examples/wordcount/WordCountSplitter.java>),
> thus allowing code to be reused between the two approaches.
> As for performance, the 0.1 version
> <https://github.com/stratosphere/stratosphere-streaming/tree/release-0.1>,
> released at the beginning of June, performed slightly better on a single
> core than Apache Storm, one of the major players in the field. Cluster
> performance needs further optimization. This version provided a
> lower-level API, fairly similar to the one Storm has. For a deeper dive
> into this state of the development and the challenges faced, please refer
> to the slides
> <http://info.ilab.sztaki.hu/~mbalassi/dw_forum_2014/dw_forum_streaming.pdf>
> of a talk from the early days of June.
> The 0.2 release is coming soon with the new API demonstrated above and
> improved single-core performance. To complete the release, the cluster
> performance is being measured, and the code is being decomposed into three
> subprojects separating core, example, and addon functionality.
> As for the future, fault tolerance is an unresolved issue, and as part of
> a Google Summer of Code project an intern is working on iterative stream
> processing.
> The project is mainly developed in Budapest by three members employed by
> the Hungarian Academy of Sciences and Eötvös Loránd University, and by
> Frank Wu, our Google Summer of Code student from Singapore. This summer the
> Hungarian Academy of Sciences also dedicated 4 interns to the project.
> The proposed 0.2 release still depends on the 0.5 release of
> Stratosphere; however, on branch snapshot-0.6
> <https://github.com/stratosphere/stratosphere-streaming/tree/snapshot-0.6>
> the dependencies are updated to 0.6-snapshot, so the codebase is ready to
> become part of the main project, preferably as part of addons until it
> becomes stable.
> Looking forward to your suggestions.
> Cheers,
> Márton, Gyula & Gábor
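
For readers without the streaming jars at hand, the shape of the quoted
WordCount pipeline (flatMap, then partitionBy(0), then map) can be
approximated in plain Java. This is only a rough sketch under stated
assumptions: it does not use the streaming API, and the names
WordCountSketch, split, partitionOf, and Counter are hypothetical stand-ins
for WordCountSplitter, partitionBy(0), and WordCountCounter, whose real
implementations may well differ.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class WordCountSketch {

    // Hypothetical stand-in for the splitter (flatMap) step:
    // one input line becomes zero or more lower-cased words.
    static List<String> split(String line) {
        List<String> words = new ArrayList<>();
        for (String w : line.toLowerCase().split("\\W+")) {
            if (!w.isEmpty()) {
                words.add(w);
            }
        }
        return words;
    }

    // Hypothetical stand-in for partitionBy(0): route a record to a
    // partition by hashing its key, so equal words share a partition.
    static int partitionOf(String word, int partitions) {
        return Math.floorMod(word.hashCode(), partitions);
    }

    // Hypothetical stand-in for the counter (map) step: keeps a running
    // count per word and returns the updated count for each record.
    static final class Counter {
        private final Map<String, Integer> counts = new HashMap<>();

        int map(String word) {
            return counts.merge(word, 1, Integer::sum);
        }
    }

    // Processes lines record by record and returns the emitted
    // "(word,count)" updates in arrival order.
    static List<String> run(List<String> lines, int partitions) {
        Counter[] counters = new Counter[partitions];
        for (int i = 0; i < partitions; i++) {
            counters[i] = new Counter();
        }
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            for (String word : split(line)) {
                int count = counters[partitionOf(word, partitions)].map(word);
                out.add("(" + word + "," + count + ")");
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> lines = new ArrayList<>();
        lines.add("to be or not to be");
        for (String record : run(lines, 2)) {
            System.out.println(record);
        }
    }
}
```

Because every occurrence of a word is routed to the same counter, the
running counts stay correct for any number of partitions, which is the
property the partitioning step is there to provide.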
