flink-dev mailing list archives

From Gyula Fóra <gyula.f...@gmail.com>
Subject Re: Adding the streaming project to the main repository
Date Fri, 08 Aug 2014 19:55:25 GMT
Okay, let's have a chat about this sometime; maybe we can come up with
something even better.

I already talked briefly with Stephan about the possibility of adjusting
the output buffers at runtime, but automatic flushing seemed like a far
easier and possibly even better choice. (It shouldn't even cause significant
overhead, since we are just sleeping in another thread.)
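
Just to make the idea concrete, here is a rough sketch of what I mean by
flushing from another thread. This is not our actual RecordWriter code, and
the class and method names are made up, it's only meant to show the pattern:

import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a buffer that flushes when full or after a timeout.
public class TimedFlushWriter<T> {

    private final List<T> buffer = new ArrayList<>();
    private final int capacity;          // flush when the buffer is full
    private final long flushIntervalMs;  // ...or after this many milliseconds

    public TimedFlushWriter(int capacity, long flushIntervalMs) {
        this.capacity = capacity;
        this.flushIntervalMs = flushIntervalMs;

        // Background thread that just sleeps and triggers a flush, so the
        // hot path (emit) pays almost nothing for the latency guarantee.
        Thread flusher = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                try {
                    Thread.sleep(flushIntervalMs);
                } catch (InterruptedException e) {
                    return;
                }
                flush();
            }
        });
        flusher.setDaemon(true);
        flusher.start();
    }

    public synchronized void emit(T record) {
        buffer.add(record);
        if (buffer.size() >= capacity) {
            flush();
        }
    }

    public synchronized void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        // Stand-in for handing the buffered records to the network stack.
        System.out.println("flushing " + buffer.size() + " records");
        buffer.clear();
    }
}

The point is that emit() stays cheap, and the latency bound only costs us a
sleeping thread on the side.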

It would also be cool if you could give us a deeper introduction to the
network layer, because it's really a lot of code to look at without a
guide :D



On Fri, Aug 8, 2014 at 9:31 PM, Ufuk Celebi <u.celebi@fu-berlin.de> wrote:

> Thanks for the detailed explanation! Very nice to hear. :)
>
> If your flushing writer does not give you enough control over the trade-off
> (in general you cannot know how large records will be, right?), we can have
> a chat about runtime changes for this. I would be happy to help with it.
>
> In theory it should not be a problem, because we don't rely on fixed
> buffer sizes and always serialize the buffer size with the network data.
>
> The only immediate problem that comes to my mind would be when the sent
> buffer is too large for the receiving buffers, but I think we could come up
> with a solution. ;)
>
> > On 08 Aug 2014, at 21:15, Gyula Fóra <gyula.fora@gmail.com> wrote:
> >
> > Hey guys,
> >
> > I might not be able to give you all the details right now, because some of
> > the data is on my colleague's computer, but I'm gonna try :)
> >
> > We have a 30-machine cluster at SZTAKI with 2 cores each; not a powerhouse,
> > but good for experimenting.
> >
> > We tested both Flink Streaming and Storm with a lot of different settings,
> > ranging from a couple of machines to full cluster utilization. We used the
> > default settings for the Flink config, and also the default settings for
> > Storm.
> >
> > We ran two experiments. The first one was a simple streaming wordcount
> > example, implemented exactly the same way in both systems. We ran the
> > streaming jobs for about 15-30 minutes each for the experiments, but we
> > also tested longer runs. The other test we ran was a streaming pagerank to
> > test the performance of the iterations (in Storm you can just connect
> > circles in the topology, so we wanted to see how it compares to the
> > out-of-topology record passing). In both examples Flink Streaming was on
> > average about 3-4 times faster than the same implementation in Storm. For
> > Storm, fault tolerance was turned off for the sake of the experiment,
> > because that's still an open issue for us.
> >
> > You can check out the implementations here in this repo:
> >
> > https://github.com/mbalassi/streaming-performance
> > (there have been a lot of API changes this week compared to what you find
> > in this repo)
> >
> > One of our colleagues who implemented these is also working on a neat GUI
> > for doing performance tests; it's pretty cool :)
> >
> > Going back to Ufuk's question regarding what I meant by the role of the
> > output buffers: one main difference between Storm and Flink Streaming
> > could be the way that output buffers are handled. In Storm you set the
> > "output buffer" size as a number of records, which is by default
> > relatively small for better latency. In Flink the output buffer is
> > typically much larger, which gives you higher throughput (that's what we
> > were measuring). To be able to provide some latency constraint, we
> > implemented a RecordWriter that flushes the output every predefined
> > number of milliseconds (or when the buffer is full). So this seems to be
> > a better way of providing a latency guarantee than setting a preset
> > buffer size as a number of records.
> >
> > I think this is a similar case to the comparison of Storm and Spark. Of
> > course Spark has much higher throughput, but the question is how we can
> > fine-tune the trade-off between latency and throughput. We will need more
> > tests to answer these questions :)
> >
> > But I think there are more performance improvements to come; we are
> > planning on doing topology optimizations similar to the batch API in the
> > future.
> >
> >
> > Cheers,
> >
> > Gyula
> >
> >
> >
> >
> >
> >> On Fri, Aug 8, 2014 at 4:59 PM, Ufuk Celebi <u.celebi@fu-berlin.de>
> >> wrote:
> >>
> >>
> >> On 08 Aug 2014, at 16:07, Kostas Tzoumas <kostas.tzoumas@tu-berlin.de>
> >> wrote:
> >>
> >>> Wow! Incredible :-) Can you share more details about the experiments you
> >>> ran (cluster setup, jobs, etc.)?
> >>
> >> Same here. :-)
> >>
> >> I would be especially interested in what you mean by "partly because of
> >> the output buffers".
> >>
> >> Best wishes,
> >>
> >> Ufuk
> >>
>
