flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Aljoscha Krettek <aljos...@apache.org>
Subject Re: Using Custom Partitioner in Streaming with parallelism 1 adding latency
Date Thu, 15 Jun 2017 13:06:29 GMT
Hi,
Before adding the partitioner all your functions are chained together. That is, everything
is executed in one Thread and sending elements from one function to the next is essentially
just a method call. By introducing a partitioner you break this chain and therefore your job
now has to send data (at least) across between different Threads. I think you should see this
if you look at the execution graph.

Best,
Aljoscha

> On 15. Jun 2017, at 14:35, sohimankotia <sohimankotia@gmail.com> wrote:
> 
> Hi,
> 
> I have streaming job which is running with parallelism 1 as of now .  (This
> job will run with parallelism > 1 in future )
> 
> So I have added custom partitioner to partition the data based on one tuple
> field .
> 
> The flow is  :
> 
> source -> map -> partitioner -> flatmap -> sink
> 
> The partitioner is adding around 40 sec or more delay for 40% of data (from
> map to flatmap ).
> 
> I am not able to understand if parallelism is one how this delay is getting
> added .
> 
> 
> 
> --
> View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Using-Custom-Partitioner-in-Streaming-with-parallelism-1-adding-latency-tp13766.html
> Sent from the Apache Flink User Mailing List archive. mailing list archive at Nabble.com.


Mime
View raw message