flink-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Gyula Fóra <gyula.f...@gmail.com>
Subject Re: Adding the streaming project to the main repository
Date Fri, 08 Aug 2014 19:15:59 GMT
Hey guys,

I might not be able to give you all the details right now, because some of
the data is on my colleague's computer, but I'm gonna try :)

We have a 30 machine cluster at SZTAKI with 2 cores each not a powerhouse
but good for experimenting.

We tested both Flink Streaming and Storm with a lot of different settings,
ranging from a couple machines to full cluster utilization. As for the
flink config, we used the default settings, and also the default settings
for Storm.

We ran two experiments, the first one was a simple streaming wordcount
example, implemented exactly the same way in both systems. We ran the
streaming jobs for about 15-30 minutes each for the experiments but we also
tested longer runs. The other test we ran was a streaming pagerank to test
the performance of the iterations (in storm you can just connect circles in
the topology, so we wanted to see how it compares to the out-of-topology
record passing). On both examples the Flink Streaming was on average about
3-4 times faster than the same implementation in storm. For storm fault
tolerance was turned of for the sake of the experiment, because that's
still an open issue for us.

You can check out the implementations here in this repo:

(there have been a lot of api changes this week compared to what you find
in this repo)

One of our colleagues who have implemented these also working on a neat gui
for doing performance tests, its pretty cool :)

Going back to Ufuk's question regarding what I meant by the role of the
output buffers. So one main difference between storm and flink streaming
could be the way that output buffers are handled. In storm you set the
"output buffer" size for the number of records, which is by default
relatively small for better latency. In flink the output buffer is
typically much larger which gives you higher throughput (that's what we
were measuring). To be able to provide some latency constraint we
implemented a RecordWriter that flushes the output every predefined number
of milliseconds (or when the buffer is full). So this seems to be a better
way of going about latency guarantee, than setting a preset buffer size
with the number of records.

I think this is similar case as with the comparison of Storm and Spark. Of
course Spark has much higher throughput, but the question is how can we
fine tune the trade-off between latency and throughput, but we will need
more tests to answer these questions :)

But I think there are more performance improvements to come, we are
planning on doing similar topology optimization as the batch api in the



On Fri, Aug 8, 2014 at 4:59 PM, Ufuk Celebi <u.celebi@fu-berlin.de> wrote:

> On 08 Aug 2014, at 16:07, Kostas Tzoumas <kostas.tzoumas@tu-berlin.de>
> wrote:
> > Wow! Incredible :-) Can you share more details about the experiments you
> > ran (cluster setup, jobs, etc)?
> Same here. :-)
> I would be especially interested about what you mean with "partly because
> of the output buffers".
> Best wishes,
> Ufuk

  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message