flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Mahantesh Patil <mahantesh...@gmail.com>
Subject Re: Flink Job sometimes does not stop as expected on cancelling.
Date Fri, 15 Feb 2019 16:22:03 GMT
Hi Team,

Following up to check if any of you have faced this issue.  If so how to
resolve this.

Regards,
Mahantesh Patil.

On Mon, Feb 11, 2019 at 3:50 PM Mahantesh Patil <mahantesh.np@gmail.com>
wrote:

> Hi Team,
>
> We have Flink jobs running in cluster mode. When I cancel the job and
> check for status it still shows as running.
>
> Below is logs generated.  I do not see any useful information. Could you
> guys point me in a right direction to debug this , our flink jobs do heavy
> processing , my understanding is on cancelling flink waits till operator
> completes processing to store save point. So is it happening because one of
> the operator is hung waiting for resource and is there a way to configure
> timeout.
>
>
> Appreciate your help.
>
>
>
> 10 Jan 2019 20:36:51,490 INFO org.apache.flink.runtime.taskmanager.Task -
> Triggering cancellation of task code Source: control -> control-filter
> (1/1) (c6f50328bb09ce1ad389e5ea614fc8a9).
> 10 Jan 2019 20:36:51,490 INFO org.apache.flink.runtime.taskmanager.Task -
> Attempting to cancel task interaction-details (1/1)
> (dc946a863e34f66a95df3a8e8d540cab).
> 10 Jan 2019 20:36:51,490 INFO org.apache.flink.runtime.taskmanager.Task -
> interaction-details (1/1) (dc946a863e34f66a95df3a8e8d540cab) switched from
> RUNNING to CANCELING.
> 10 Jan 2019 20:36:51,491 INFO org.apache.flink.runtime.taskmanager.Task -
> Triggering cancellation of task code interaction-details (1/1)
> (dc946a863e34f66a95df3a8e8d540cab).
> 10 Jan 2019 20:36:51,491 INFO org.apache.flink.runtime.taskmanager.Task -
> Attempting to cancel task current-longests (1/1)
> (2f1cee63ab479a8e9e492c990311ccc1).
> 10 Jan 2019 20:36:51,491 INFO org.apache.flink.runtime.taskmanager.Task -
> current-longests (1/1) (2f1cee63ab479a8e9e492c990311ccc1) switched from
> RUNNING to CANCELING.
> 10 Jan 2019 20:36:51,491 INFO org.apache.flink.runtime.taskmanager.Task -
> Triggering cancellation of task code current-longests (1/1)
> (2f1cee63ab479a8e9e492c990311ccc1).
> 10 Jan 2019 20:36:51,492 INFO org.apache.flink.runtime.taskmanager.Task -
> Attempting to cancel task interactions-count (1/1)
> (aff734ce398978cdc3ac40a1e530d0ed).
> 10 Jan 2019 20:36:51,492 INFO org.apache.flink.runtime.taskmanager.Task -
> interactions-count (1/1) (aff734ce398978cdc3ac40a1e530d0ed) switched from
> RUNNING to CANCELING.
> 10 Jan 2019 20:36:51,628 INFO org.apache.flink.runtime.taskmanager.Task -
> Triggering cancellation of task code interactions-count (1/1)
> (aff734ce398978cdc3ac40a1e530d0ed).
> 10 Jan 2019 20:36:51,680 INFO org.apache.flink.runtime.taskmanager.Task -
> Attempting to cancel task agents-working-on-interactions (1/1)
> (15aa5e81a66a46127ab85719b9df7279).
> 10 Jan 2019 20:36:51,680 INFO org.apache.flink.runtime.taskmanager.Task -
> agents-working-on-interactions (1/1) (15aa5e81a66a46127ab85719b9df7279)
> switched from RUNNING to CANCELING.
> 10 Jan 2019 20:36:51,680 INFO org.apache.flink.runtime.taskmanager.Task -
> Triggering cancellation of task code agents-working-on-interactions (1/1)
> (15aa5e81a66a46127ab85719b9df7279).
> 10 Jan 2019 20:36:51,811 ERROR
> org.apache.flink.streaming.runtime.tasks.StreamTask - Could not shut down
> timer service
> java.lang.InterruptedException
>         at
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2067)
>         at
> java.util.concurrent.ThreadPoolExecutor.awaitTermination(ThreadPoolExecutor.java:1475)
>         at
> org.apache.flink.streaming.runtime.tasks.SystemProcessingTimeService.shutdownAndAwaitPending(SystemProcessingTimeService.java:197)
>         at
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:317)
>         at org.apache.flink.runtime.taskmanager.Task.run(Task.java:718)
>         at java.lang.Thread.run(Thread.java:748)
> 10 Jan 2019 20:36:51,811 ERROR
> org.apache.flink.streaming.runtime.tasks.StreamTask - Could not shut down
> timer service
> java.lang.InterruptedException
>         at
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2067)
>         at
> java.util.concurrent.ThreadPoolExecutor.awaitTermination(ThreadPoolExecutor.java:1475)
>         at
> org.apache.flink.streaming.runtime.tasks.SystemProcessingTimeService.shutdownAndAwaitPending(SystemProcessingTimeService.java:197)
>         at
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:317)
>         at org.apache.flink.runtime.taskmanager.Task.run(Task.java:718)
>         at java.lang.Thread.run(Thread.java:748)
> 10 Jan 2019 20:36:51,813 INFO org.apache.flink.runtime.taskmanager.Task -
> Attempting to cancel task subscription-management (1/1)
> (f5e529744e645e6acf00d2c5aab2006e).
> 10 Jan 2019 20:37:07,156 INFO org.apache.flink.runtime.taskmanager.Task -
> subscription-management (1/1) (f5e529744e645e6acf00d2c5aab2006e) switched
> from RUNNING to CANCELING.
> 10 Jan 2019 20:37:07,157 INFO org.apache.flink.runtime.taskmanager.Task -
> Triggering cancellation of task code subscription-management (1/1)
> (f5e529744e645e6acf00d2c5aab2006e).
> 10 Jan 2019 20:37:07,164 INFO org.apache.flink.runtime.taskmanager.Task -
> Attempting to cancel task filter-business-metrics -> Sink: data_feed (1/1)
> (6c229624ea4735215fe98f5458c58c8e).
> 10 Jan 2019 20:37:07,164 INFO org.apache.flink.runtime.taskmanager.Task -
> filter-business-metrics -> Sink: data_feed (1/1)
> (6c229624ea4735215fe98f5458c58c8e) switched from RUNNING to CANCELING.
> 10 Jan 2019 20:37:07,164 INFO org.apache.flink.runtime.taskmanager.Task -
> Triggering cancellation of task code filter-business-metrics -> Sink:
> data_feed (1/1) (6c229624ea4735215fe98f5458c58c8e).
> 10 Jan 2019 20:37:07,182 INFO org.apache.flink.runtime.taskmanager.Task -
> Attempting to cancel task interaction-details (1/1)
> (dc946a863e34f66a95df3a8e8d540cab).
> 10 Jan 2019 20:37:07,182 INFO org.apache.flink.runtime.taskmanager.Task -
> Task interaction-details (1/1) is already in state CANCELING
> 10 Jan 2019 20:37:07,182 INFO org.apache.flink.runtime.taskmanager.Task -
> Attempting to cancel task current-longests (1/1)
> (2f1cee63ab479a8e9e492c990311ccc1).
> 10 Jan 2019 20:37:07,182 INFO org.apache.flink.runtime.taskmanager.Task -
> Task current-longests (1/1) is already in state CANCELING
> 10 Jan 2019 20:37:07,182 INFO org.apache.flink.runtime.taskmanager.Task -
> Attempting to cancel task interactions-count (1/1)
> (aff734ce398978cdc3ac40a1e530d0ed).
> 10 Jan 2019 20:37:07,182 INFO org.apache.flink.runtime.taskmanager.Task -
> Task interactions-count (1/1) is already in state CANCELING
> 10 Jan 2019 20:37:07,182 INFO org.apache.flink.runtime.taskmanager.Task -
> Attempting to cancel task agents-working-on-interactions (1/1)
> (15aa5e81a66a46127ab85719b9df7279).
> 10 Jan 2019 20:37:07,182 INFO org.apache.flink.runtime.taskmanager.Task -
> Task agents-working-on-interactions (1/1) is already in state CANCELING
> 10 Jan 2019 20:37:07,182 INFO org.apache.flink.runtime.taskmanager.Task -
> Attempting to cancel task subscription-management (1/1)
> (f5e529744e645e6acf00d2c5aab2006e).
> 10 Jan 2019 20:37:07,182 INFO org.apache.flink.runtime.taskmanager.Task -
> Task subscription-management (1/1) is already in state CANCELING
> 10 Jan 2019 20:37:07,182 INFO org.apache.flink.runtime.taskmanager.Task -
> Attempting to cancel task filter-business-metrics -> Sink: data_feed (1/1)
> (6c229624ea4735215fe98f5458c58c8e).
> 10 Jan 2019 20:37:07,182 INFO org.apache.flink.runtime.taskmanager.Task -
> Task filter-business-metrics -> Sink: data_feed (1/1) is already in state
> CANCELING
>
>
>
>
>
> Regards,
> Mahantesh Patil.
>

Mime
View raw message