flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Aljoscha Krettek <aljos...@apache.org>
Subject Re: Using numberOfTaskSlots to control parallelism
Date Sat, 20 Feb 2016 09:12:13 GMT
IMHO the only change for 2) is that you possibly get better machine utilization because it
will use more parallel threads.  So I think it’s a valid approach.

@Ufuk, could there be problems with the number of network buffers? I think not, because the
connections are multiplexed in one channel, is this correct?

I’ll also talk with the others so see if we can resolve the watermark/kafka partition issues
before the 1.0 release.
> On 20 Feb 2016, at 02:14, Zach Cox <zcox522@gmail.com> wrote:
> 
> What would the differences be between these scenarios?
> 
> 1) one task manager with numberOfTaskSlots=1 and one job with parallelism=1
> 
> 2) one task manager with numberOfTaskSlots=10 and one job with parallelism=10
> 
> In both cases all of the job's tasks get executed within the one task manager's jvm.
Are there any downsides to doing #2 instead of #1?
> 
> I ask this question because of current issues related to watermarks with Kafka sources
[1] [2] and changing parallelism with savepoints [3]. I'm writing a Flink job that processes
events from Kafka topics that have 12 partitions. I'm wondering if I should just set the job
parallelism=12 and make numberOfTaskSlots sum to 12 across however many task managers I set
up. It seems like watermarks would work properly then, and I could effectively change job
parallelism using the number of task managers (e.g. 1 TM with slots=12, or 2 TMs with slots=6,
or 12 TMs with slots=1, etc).
> 
> Am I missing any important details that would make this a bad idea? It seems like a bit
of abuse of numberOfTaskSlots, but also seems like a fairly simple solution to a few current
issues.
> 
> Thanks,
> Zach
> 
> [1] http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Kafka-partition-alignment-for-event-time-tt4782.html
> [2] https://issues.apache.org/jira/browse/FLINK-3375
> [3] http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Changing-parallelism-tt4967.html
> 


Mime
View raw message