flink-dev mailing list archives

From Greg Hogan <c...@greghogan.com>
Subject Re: Performance and Latency Chart for Flink
Date Mon, 19 Sep 2016 19:43:24 GMT
You will need to add the configuration parameters to your flink-conf.yaml.
I believe the intent is that all configuration parameters should be listed
at

https://ci.apache.org/projects/flink/flink-docs-master/setup/config.html#full-reference

My understanding is that the Flink buffers are currently copied to Netty
buffers, although I don't understand the stated memory doubling.
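
As a rough sketch of the arithmetic discussed in the quoted messages below: it only reproduces the numbers mentioned in this thread (4096 buffers, 128 MiB, 156.25 MiB) and does not explain the 300 MiB figure from the documentation example. The function names are hypothetical, and the 32 KiB segment size and the factor of 4 are assumptions taken from the quoted messages, not read from a running Flink setup.

```python
KIB_PER_MIB = 1024


def num_network_buffers(slots_per_tm: int, num_tms: int, factor: int = 4) -> int:
    """Buffer count as slots-per-TM^2 * number-of-TMs * factor.

    The factor of 4 is the constant used in the quoted message (assumption).
    """
    return slots_per_tm ** 2 * num_tms * factor


def buffer_memory_mib(num_buffers: int, segment_kib: int = 32) -> float:
    """Memory taken by the buffers in MiB, assuming 32 KiB per buffer."""
    return num_buffers * segment_kib / KIB_PER_MIB


if __name__ == "__main__":
    n = num_network_buffers(slots_per_tm=16, num_tms=4)
    print(n)                        # 16^2 * 4 * 4 buffers
    print(buffer_memory_mib(n))     # memory for the current setting, in MiB
    print(buffer_memory_mib(5000))  # memory for the 5000-buffer example, in MiB
```

With 16 slots per TaskManager and 4 TaskManagers this gives 4096 buffers, i.e. 128 MiB, and 5000 buffers come to 156.25 MiB, matching the figures quoted below.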


On Mon, Sep 19, 2016 at 3:08 PM, amir bahmanyari <
amirtousa@yahoo.com.invalid> wrote:

> Hi Greg, in the same Flink config link below, there are parameters that
> don't even exist in flink-conf.yaml. Are they defined somewhere else? I
> grepped for the following and none existed in any of the files under the
> conf folder: "taskmanager.memory.fraction", "taskmanager.memory.off-heap",
> "taskmanager.memory.segment-size", and many more.
> Also, isn't the example calculating the network buffers wrong? Based on the
> example, roughly 5000 buffers x 32 KiB = 160000 KiB should be
> allocated. 160000 KiB divided by 1024 = 156.25 MiB. Why does the example
> say "the system would allocate roughly 300 MiBytes for network buffers"?
> That's roughly twice as much. Am I missing something here? I still need your
> help to set an accurate number for my
>    - taskmanager.network.numberOfBuffers = 4096.
>
> Thanks for your response, Greg.
> Amir
>
>       From: amir bahmanyari <amirtousa@yahoo.com>
>  To: "dev@flink.apache.org" <dev@flink.apache.org>
>  Sent: Monday, September 19, 2016 10:34 AM
>  Subject: Re: Performance and Latency Chart for Flink
>
> Hi Greg, I used this guideline to calculate
> "taskmanager.network.numberOfBuffers":
> Apache Flink 1.2-SNAPSHOT Documentation: Configuration
>
> 4096 = (16x16)x4x4, where 16 is the number of tasks per TM, 4 is the # of
> TMs, and the last 4 is the factor in the formula. What would you set it to?
> Once I have that number, I will set "taskmanager.memory.preallocate" to true
> and will give it another shot. Thanks, Greg
>
>       From: Greg Hogan <code@greghogan.com>
>  To: dev@flink.apache.org; amir bahmanyari <amirtousa@yahoo.com>
>  Sent: Monday, September 19, 2016 8:29 AM
>  Subject: Re: Performance and Latency Chart for Flink
>
> Hi Amir,
>
> You may see improved performance setting "taskmanager.memory.preallocate:
> true" in order to use off-heap memory.
>
> Also, your number of buffers looks quite low and you may want to increase
> "taskmanager.network.numberOfBuffers". Your setting of 4096 is only 128
> MiB.
>
> As this is only a benchmark, are you able to post the code to GitHub to
> solicit feedback?
>
> Greg
>
> On Sun, Sep 18, 2016 at 9:00 PM, amir bahmanyari <
> amirtousa@yahoo.com.invalid> wrote:
>
> > I have new findings and, subsequently, relative improvements. I am testing
> > as we speak: 4 Beam server nodes on Azure A11 and 2 Kafka nodes with the
> > same config. I had to keep state somewhere, so I went with Redis. I found
> > it to be a major bottleneck, as the Beam nodes are constantly going across
> > the network to update its repository. So I replaced Redis with Java
> > ConcurrentHashMaps. Much faster.
> > Then Kafka ran out of disk space and the replication manager
> > complained, so I clustered the two Kafka nodes hoping to share space. As
> > of this second, as I am typing this email, it is sustaining, but only 1/2
> > of the 201401969 tuples have been processed after 3.5 hours. According to
> > the Linear Road benchmarking expectations, if your system is working well,
> > all 201401969 tuples must be done in 3.5 hrs max. So this means there is
> > still room for tuning the Flink nodes. I have already shared more details
> > about my config with you all. It ran perfectly yesterday with almost 1/10th
> > of this load: perfect real-time send/processed streaming behavior. If
> > that's the case and I cannot get better performance with FlinkRunner, my
> > next stop is SparkRunner and a repeat of the whole thing for final
> > benchmarking of the two under Beam APIs, which was the initial intent
> > anyway. If you have suggestions for improvements in the above case, I am
> > all ears and would greatly appreciate them.
> > Cheers,
> > Amir
> >
> >      From: "Chawla,Sumit" <sumitkchawla@gmail.com>
> >  To: dev@flink.apache.org; amir bahmanyari <amirtousa@yahoo.com>
> >  Sent: Sunday, September 18, 2016 2:07 PM
> >  Subject: Re: Performance and Latency Chart for Flink
> >
> > Has anyone else run this kind of benchmark?  Would love to hear more
> > people's experience and details about those benchmarks.
> >
> > Regards
> > Sumit Chawla
> >
> >
> > On Sun, Sep 18, 2016 at 2:01 PM, Chawla,Sumit <sumitkchawla@gmail.com>
> > wrote:
> >
> > > Hi Amir
> > >
> > > Would it be possible for you to share the numbers? Also, if possible,
> > > share your configuration details.
> > >
> > > Regards
> > > Sumit Chawla
> > >
> > >
> > > On Fri, Sep 16, 2016 at 12:18 PM, amir bahmanyari <
> > > amirtousa@yahoo.com.invalid> wrote:
> > >
> > >> Hi Fabian, FYI: this is a report on other engines for which we did the
> > >> same type of benchmarking. It also explains what Linear Road
> > >> benchmarking is. Thanks for your help.
> > >> http://www.slideshare.net/RedisLabs/walmart-ibm-revisit-the-linear-road-benchmark
> > >> https://github.com/IBMStreams/benchmarks
> > >> https://www.datatorrent.com/blog/blog-implementing-linear-road-benchmark-in-apex/
> > >>
> > >>
> > >>      From: Fabian Hueske <fhueske@gmail.com>
> > >>  To: "dev@flink.apache.org" <dev@flink.apache.org>
> > >>  Sent: Friday, September 16, 2016 12:31 AM
> > >>  Subject: Re: Performance and Latency Chart for Flink
> > >>
> > >> Hi,
> > >>
> > >> I am not aware of periodic performance runs for the Flink releases.
> > >> I know of a few benchmarks which have been published at different
> > >> points in time, like [1], [2], and [3] (you'll probably find more).
> > >>
> > >> In general, fair benchmarks that compare different systems (if there is
> > >> such a thing) are very difficult, and the results often depend on the
> > >> use case.
> > >> IMO the best option is to run your own benchmarks, if you have a
> > >> concrete use case.
> > >>
> > >> Best, Fabian
> > >>
> > >> [1] 08/2015:
> > >> http://data-artisans.com/high-throughput-low-latency-and-exactly-once-stream-processing-with-apache-flink/
> > >> [2] 12/2015:
> > >> https://yahooeng.tumblr.com/post/135321837876/benchmarking-streaming-computation-engines-at
> > >> [3] 02/2016:
> > >> http://data-artisans.com/extending-the-yahoo-streaming-benchmark/
> > >>
> > >>
> > >> 2016-09-16 5:54 GMT+02:00 Chawla,Sumit <sumitkchawla@gmail.com>:
> > >>
> > >> > Hi
> > >> >
> > >> > Is there any performance run that is done for each Flink release? Or
> > >> > are you aware of any third-party evaluation of performance metrics
> > >> > for Flink?
> > >> > I am interested in seeing how performance has improved from release
> > >> > to release, and performance vs. other competitors.
> > >> >
> > >> > Regards
> > >> > Sumit Chawla
> > >> >
