flink-issues mailing list archives

From "Robert Metzger (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-5462) Flink job fails due to java.util.concurrent.CancellationException while snapshotting
Date Wed, 18 Jan 2017 15:32:26 GMT

    [ https://issues.apache.org/jira/browse/FLINK-5462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15828276#comment-15828276 ]

Robert Metzger commented on FLINK-5462:
---------------------------------------

I've looked through this and other logs with the same error, and I think this error "only"
occurs in the presence of other failures, not as the root cause of an issue.

I don't think that we need to urgently fix this issue for the 1.2 release.

> Flink job fails due to java.util.concurrent.CancellationException while snapshotting
> ------------------------------------------------------------------------------------
>
>                 Key: FLINK-5462
>                 URL: https://issues.apache.org/jira/browse/FLINK-5462
>             Project: Flink
>          Issue Type: Bug
>          Components: State Backends, Checkpointing
>    Affects Versions: 1.2.0
>            Reporter: Robert Metzger
>         Attachments: application-1484132267957-0005
>
>
> I'm using Flink 699f4b0.
> My restored, rescaled Flink job failed while creating a checkpoint with the following exception:
> {code}
> 2017-01-11 18:46:49,853 INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator     - Triggering checkpoint 3 @ 1484160409846
> 2017-01-11 18:49:50,111 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - TriggerWindow(TumblingEventTimeWindows(4), ListStateDescriptor{serializer=org.apache.flink.api.java.typeutils.runtime.TupleSerializer@2edcd071}, EventTimeTrigger(), WindowedStream.apply(AllWindowedStream.java:440)) (1/1) (2accc6ca2727c4f7ec963318fbd237e9) switched from RUNNING to FAILED.
> AsynchronousException{java.lang.Exception: Could not materialize checkpoint 3 for operator TriggerWindow(TumblingEventTimeWindows(4), ListStateDescriptor{serializer=org.apache.flink.api.java.typeutils.runtime.TupleSerializer@2edcd071}, EventTimeTrigger(), WindowedStream.apply(AllWindowedStream.java:440)) (1/1).}
>         at org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:939)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.Exception: Could not materialize checkpoint 3 for operator TriggerWindow(TumblingEventTimeWindows(4), ListStateDescriptor{serializer=org.apache.flink.api.java.typeutils.runtime.TupleSerializer@2edcd071}, EventTimeTrigger(), WindowedStream.apply(AllWindowedStream.java:440)) (1/1).
>         ... 6 more
> Caused by: java.util.concurrent.CancellationException
>         at java.util.concurrent.FutureTask.report(FutureTask.java:121)
>         at java.util.concurrent.FutureTask.get(FutureTask.java:188)
>         at org.apache.flink.util.FutureUtil.runIfNotDoneAndGet(FutureUtil.java:40)
>         at org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:899)
>         ... 5 more
> 2017-01-11 18:49:50,113 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Job Generate Event Window stream (90859d392c1da472e07695f434b332ef) switched from state RUNNING to FAILING.
> AsynchronousException{java.lang.Exception: Could not materialize checkpoint 3 for operator TriggerWindow(TumblingEventTimeWindows(4), ListStateDescriptor{serializer=org.apache.flink.api.java.typeutils.runtime.TupleSerializer@2edcd071}, EventTimeTrigger(), WindowedStream.apply(AllWindowedStream.java:440)) (1/1).}
>         at org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:939)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.Exception: Could not materialize checkpoint 3 for operator TriggerWindow(TumblingEventTimeWindows(4), ListStateDescriptor{serializer=org.apache.flink.api.java.typeutils.runtime.TupleSerializer@2edcd071}, EventTimeTrigger(), WindowedStream.apply(AllWindowedStream.java:440)) (1/1).
>         ... 6 more
> Caused by: java.util.concurrent.CancellationException
>         at java.util.concurrent.FutureTask.report(FutureTask.java:121)
>         at java.util.concurrent.FutureTask.get(FutureTask.java:188)
>         at org.apache.flink.util.FutureUtil.runIfNotDoneAndGet(FutureUtil.java:40)
>         at org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:899)
>         ... 5 more
> 2017-01-11 18:49:50,122 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Source: Custom Source -> Timestamps/Watermarks (1/2) (e52c1211b5693552f5908b0082c80882) switched from RUNNING to CANCELING.
> {code}
> There are no other exceptions logged around that time.
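
The innermost cause in the trace above is raised by FutureTask.get(), which is what
the JDK does when get() is called on a future that has already been cancelled. The
sketch below reproduces only that JDK behaviour in isolation. The class name and the
outcome() helper are hypothetical, and the runIfNotDoneAndGet method here is a
simplified stand-in written for illustration, assumed to resemble the Flink utility
named in the trace; it is not copied from Flink and says nothing about what cancelled
the snapshot in this job.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;

public class CancelledSnapshotSketch {

    // Hypothetical stand-in (an assumption, not Flink's code): run the task
    // unless it is already done, then block for its result.
    static <T> T runIfNotDoneAndGet(FutureTask<T> task)
            throws ExecutionException, InterruptedException {
        if (!task.isDone()) {
            task.run();
        }
        // For a cancelled task, get() throws CancellationException.
        return task.get();
    }

    // Returns the task result, or "cancelled" if the task was cancelled
    // before its result was requested.
    static String outcome(boolean cancelFirst) {
        FutureTask<String> snapshot = new FutureTask<>(() -> "state handle");
        if (cancelFirst) {
            // A cancelled FutureTask reports isDone() == true, so it is never run.
            snapshot.cancel(true);
        }
        try {
            return runIfNotDoneAndGet(snapshot);
        } catch (CancellationException e) {
            return "cancelled";
        } catch (ExecutionException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(outcome(false)); // prints "state handle"
        System.out.println(outcome(true));  // prints "cancelled"
    }
}
```

If the snapshot future is cancelled by some other part of the system before the
asynchronous checkpoint thread asks for its result, a CancellationException like the
one logged here would be the visible symptom rather than the original failure.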



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
