flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-9413) Tasks can fail with PartitionNotFoundException if consumer deployment takes too long
Date Mon, 04 Jun 2018 08:35:00 GMT

    [ https://issues.apache.org/jira/browse/FLINK-9413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16499895#comment-16499895 ]

ASF GitHub Bot commented on FLINK-9413:
---------------------------------------

Github user StephanEwen commented on the issue:

    https://github.com/apache/flink/pull/6103
  
    How critical is it to change this setting? 
    I would assume this should be caught by the regular recovery, so unless this occurs very
    often and thus leads to confusing exceptions in the log, should we maybe leave it as it is?


> Tasks can fail with PartitionNotFoundException if consumer deployment takes too long
> ------------------------------------------------------------------------------------
>
>                 Key: FLINK-9413
>                 URL: https://issues.apache.org/jira/browse/FLINK-9413
>             Project: Flink
>          Issue Type: Bug
>          Components: Distributed Coordination
>    Affects Versions: 1.4.0, 1.5.0, 1.6.0
>            Reporter: Till Rohrmann
>            Assignee: mingleizhang
>            Priority: Critical
>
> {{Tasks}} can fail with a {{PartitionNotFoundException}} if the deployment of the producer
> takes too long. More specifically, if it takes longer than {{taskmanager.network.request-backoff.max}},
> the consuming {{Task}} will give up and fail.
> The problem is that we calculate the {{InputGateDeploymentDescriptor}} for a consuming
> task once the producer has been assigned a slot, but we do not wait until it is actually running.
> The problem should be fixed if we wait until the producer is in state {{RUNNING}} before assigning
> the result partition to the consumer.
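
To illustrate why a slow producer deployment can exhaust the consumer's retries, here is a minimal sketch of a doubling backoff that is capped at a maximum. It is not Flink's implementation: the class BackoffSketch, its methods, and the 100 ms / 10000 ms values are assumptions chosen for illustration. Only the option name taskmanager.network.request-backoff.max comes from the issue text.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not taken from the Flink code base.
public class BackoffSketch {

    // Returns the delays (in ms) a consumer would wait between partition
    // requests: start at initialMs, double each time, cap at maxMs, and
    // stop once a delay equal to maxMs has been used.
    public static List<Integer> retryDelays(int initialMs, int maxMs) {
        if (initialMs <= 0 || maxMs < initialMs) {
            throw new IllegalArgumentException("need 0 < initialMs <= maxMs");
        }
        List<Integer> delays = new ArrayList<>();
        int current = initialMs;
        while (true) {
            delays.add(current);
            if (current >= maxMs) {
                break; // no further retries: the consumer gives up after this wait
            }
            // Widen to long before doubling to avoid int overflow.
            current = (int) Math.min((long) current * 2, (long) maxMs);
        }
        return delays;
    }

    // Total time spent waiting before the consumer gives up.
    public static long totalWaitMs(int initialMs, int maxMs) {
        long total = 0;
        for (int delay : retryDelays(initialMs, maxMs)) {
            total += delay;
        }
        return total;
    }

    public static void main(String[] args) {
        // Assumed example values: 100 ms initial backoff, 10000 ms maximum.
        System.out.println(retryDelays(100, 10_000));
        System.out.println(totalWaitMs(100, 10_000) + " ms total");
    }
}
```

Under these assumed values the sketch stops retrying after about 22.7 seconds of cumulative waiting, so a producer that needs longer than that to start running would lead to the failure described above.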



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)