flink-dev mailing list archives

From Norman Spangenberg <wir12...@studserv.uni-leipzig.de>
Subject flink performance
Date Mon, 08 Sep 2014 12:48:56 GMT
I'm a bit confused about the performance of Flink.
My cluster consists of 4 nodes, each with 8 cores and 16 GB of memory
(1.5 GB reserved for the OS). I'm using flink-0.6 in standalone cluster mode.
I experimented a little with the config settings, but without much impact
on execution time. These are the settings I'm using:
jobmanager.rpc.port: 6123
jobmanager.heap.mb: 1024
taskmanager.heap.mb: 14336
taskmanager.memory.size: -1
taskmanager.numberOfTaskSlots: 4
parallelization.degree.default: 16
taskmanager.network.numberOfBuffers: 4096
fs.hdfs.hadoopconf: /opt/yarn/hadoop-2.4.0/etc/hadoop/
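
As a sanity check on the settings above, here is a minimal sketch of the slot arithmetic. It assumes one TaskManager per node, which is not stated explicitly above, and the variable names are only illustrative.

```python
# Values copied from the cluster description and config above.
nodes = 4
cores_per_node = 8
slots_per_taskmanager = 4   # taskmanager.numberOfTaskSlots
default_parallelism = 16    # parallelization.degree.default

# Assumption: exactly one TaskManager runs on each node.
total_cores = nodes * cores_per_node
total_slots = nodes * slots_per_taskmanager

print("cores in cluster:   ", total_cores)
print("task slots:         ", total_slots)
print("default parallelism:", default_parallelism)
print("slots per core:     ", total_slots / total_cores)
```

Under that assumption the default parallelism matches the number of slots, while the slot count is half the number of cores.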

I tried two applications: wordcount and the k-means Scala example code.
Wordcount needs 5 minutes for 25 GB and 13 minutes for 50 GB.
K-means (10 iterations) needs 86 seconds for 56 MB of input, but with
1.1 GB of input it needs 33 minutes, and with 2.2 GB nearly 90 minutes!
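
To make the runtimes above easier to compare, here is a rough sketch that converts them to aggregate throughput. It assumes 1 GB = 1024 MB and treats "nearly 90 minutes" as 90 minutes; the helper name is only illustrative.

```python
def throughput_mb_per_s(size_mb, seconds):
    """Aggregate input throughput in MB/s for one run."""
    return size_mb / seconds

# (label, input size in MB, runtime in seconds), taken from the runs above.
runs = [
    ("wordcount 25 GB", 25 * 1024, 5 * 60),
    ("wordcount 50 GB", 50 * 1024, 13 * 60),
    ("kmeans 56 MB", 56, 86),
    ("kmeans 1.1 GB", 1.1 * 1024, 33 * 60),
    ("kmeans 2.2 GB", 2.2 * 1024, 90 * 60),  # "nearly 90 minutes" rounded up
]

for label, size_mb, seconds in runs:
    rate = throughput_mb_per_s(size_mb, seconds)
    print(f"{label:16s} {rate:8.2f} MB/s")
```

Note that k-means makes 10 passes over its input, so its per-pass rate is higher than the figure printed here, but the rate still drops as the input grows.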

The monitoring tool Ganglia shows low CPU utilization and a lot of
wait time for k-means, whereas for wordcount the CPU is utilized at
nearly 100 percent.
Is this a normal order of magnitude for execution time in Spark? Or are
optimizations in my config necessary? Or is there maybe a bottleneck in
the cluster?

I hope somebody can help me :)
Greetings, Norman
