flink-user mailing list archives

From Ken Krugler <kkrugler_li...@transpac.com>
Subject Tuning parallelism in cascading-flink planner
Date Wed, 27 Apr 2016 04:25:12 GMT
Hi all,

I’m busy tuning up a workflow (defined with Cascading, planned with Flink) that runs on a
5-slave EMR cluster.

The default parallelism (from the Flink planner) is set to 40, since I’ve got 5 task managers
(one per node) and 8 slots/TM.
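
To make that arithmetic concrete, here is a minimal sketch in plain Java (no Flink APIs; the
class and method names are made up for illustration). It assumes the subtasks of an operator
are spread evenly across the task managers, which may not be exactly what the scheduler does.

```java
// Hypothetical helper, not part of Flink or cascading-flink.
public final class SlotMath {
    private SlotMath() {}

    /** Total task slots in the cluster (task managers times slots per task manager). */
    static int totalSlots(int taskManagers, int slotsPerTaskManager) {
        return taskManagers * slotsPerTaskManager;
    }

    /** Subtasks of one operator per node, assuming an even spread (rounded up). */
    static int subtasksPerNode(int parallelism, int taskManagers) {
        return (parallelism + taskManagers - 1) / taskManagers;
    }

    public static void main(String[] args) {
        int taskManagers = 5;        // one per node
        int slotsPerTaskManager = 8; // 8 slots/TM

        int defaultParallelism = totalSlots(taskManagers, slotsPerTaskManager);
        System.out.println("default parallelism: " + defaultParallelism);
        System.out.println("subtasks per node: "
                + subtasksPerNode(defaultParallelism, taskManagers));
    }
}
```

So at the default, every node could be running up to 8 subtasks of the same operator at once.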

But this seems to jam things up: I see simultaneous GroupReduce subtasks that appear to be
competing for resources.

Any insight into how to tune this?

And what’s the right way to set it on a per-subtask basis? With Cascading Flows planned for
MapReduce, I can set the number of reducers via a Hadoop JobConf configuration setting, on a per-step
(to use Cascading lingo) basis. But with a Flow planned for Flink, there’s only a single


— Ken
