flink-user mailing list archives

From "Zhijiang(wangzhijiang999)" <wangzhijiang...@aliyun.com>
Subject Re: PartitionNotFoundException on deploying streaming job
Date Tue, 04 Apr 2017 16:03:31 GMT
Hi Kamil,
     When the producer receives a PartitionRequest from a downstream task, it first
checks whether the requested partition is already registered. If it is not, it responds
with a PartitionNotFoundException. When the upstream task is submitted and begins to run,
it registers all of its partitions with the ResultPartitionManager. So in your case the
partition request arrived before the partition was registered. Maybe the JobManager
submitted the upstream task late, or some logic delayed registering the task in the
NetworkEnvironment. You can debug the state of the upstream task at the moment it responds
with PartitionNotFound to track down the reason. Looking forward to your further findings!
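
The ordering problem described above can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyPartitionRegistry`, `PartitionNotFound`, and `requestSucceeds` are hypothetical names made up for illustration and are not Flink's actual classes or APIs. It shows only that a request arriving before registration is rejected, while the same request after registration succeeds.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Toy model of the request-before-registration race.
// All names here are illustrative; they are NOT Flink internals.
public class PartitionRaceSketch {

    /** Hypothetical stand-in for the error the producer sends back. */
    static class PartitionNotFound extends RuntimeException {
        PartitionNotFound(String partitionId) {
            super("Partition not found: " + partitionId);
        }
    }

    /** Hypothetical stand-in for the producer-side partition registry. */
    static class ToyPartitionRegistry {
        private final Set<String> registered = ConcurrentHashMap.newKeySet();

        // Called when the upstream task starts running.
        void register(String partitionId) {
            registered.add(partitionId);
        }

        // Called when a downstream task asks for a partition.
        void handleRequest(String partitionId) {
            if (!registered.contains(partitionId)) {
                throw new PartitionNotFound(partitionId);
            }
        }
    }

    /** Returns true if the request is served, false if it is rejected. */
    static boolean requestSucceeds(ToyPartitionRegistry registry, String partitionId) {
        try {
            registry.handleRequest(partitionId);
            return true;
        } catch (PartitionNotFound e) {
            return false;
        }
    }

    public static void main(String[] args) {
        ToyPartitionRegistry registry = new ToyPartitionRegistry();

        // Downstream request arrives first: the partition is not registered yet.
        System.out.println("before registration: " + requestSucceeds(registry, "p-0"));

        // Upstream task starts and registers its partition.
        registry.register("p-0");

        // The same request now succeeds, as it does after the job restarts.
        System.out.println("after registration: " + requestSucceeds(registry, "p-0"));
    }
}
```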
------------------------------------------------------------------
From: Kamil Dziublinski <kamil.dziublinski@gmail.com>
Sent: Tuesday, April 4, 2017 17:20
To: user <user@flink.apache.org>
Subject: PartitionNotFoundException on deploying streaming
Hi guys,
When I run my streaming job, it almost always fails initially with a PartitionNotFoundException.
The job fails, then restarts and runs fine. I wonder what is causing this and whether I can adjust
some parameters to avoid the initial failure.
I have a Flink session on YARN with 55 task managers, each with 4 cores and 4 GB. This setup uses
77% of my YARN cluster.
Any ideas?