flink-dev mailing list archives

From "Robert Metzger (JIRA)" <j...@apache.org>
Subject [jira] [Created] (FLINK-3078) JobManager does not shutdown when checkpointed jobs are running
Date Wed, 25 Nov 2015 13:30:11 GMT
Robert Metzger created FLINK-3078:
-------------------------------------

             Summary: JobManager does not shutdown when checkpointed jobs are running
                 Key: FLINK-3078
                 URL: https://issues.apache.org/jira/browse/FLINK-3078
             Project: Flink
          Issue Type: Bug
          Components: JobManager
    Affects Versions: 0.10.0, 0.10.1
            Reporter: Robert Metzger


While testing the 0.10.1 release, I found that the JobManager does not shut down when I stop
it while a streaming job is running.

It seems that the checkpoint coordinator and the execution graph are still logging, even though
the JobManager actor system and other services have been shut down.

This is a log file of an affected JobManager: https://gist.github.com/rmetzger/a1532c18eb7081977cee

{code}
11:58:04,406 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Flat Map -> Sink: Unnamed (10/10) (c6544ca6d88e2d1acdec5c838d5fce06) switched from CANCELING to FAILED
11:58:04,406 DEBUG org.apache.flink.runtime.executiongraph.ExecutionGraph        - Kafka Consumer Topology switched from FAILING to RESTARTING.
11:58:04,407 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Delaying retry of job execution for 100000 ms ...
11:58:04,417 INFO  org.apache.flink.runtime.blob.BlobServer                      - Stopped BLOB server at 0.0.0.0:44904
11:58:04,421 INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator         - Shutting down remote daemon.
11:58:04,422 INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator         - Remote daemon shut down; proceeding with flushing remote transports.
11:58:04,446 INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator         - Remoting shut down.
11:58:04,473 INFO  org.apache.flink.runtime.webmonitor.WebRuntimeMonitor         - Removing web root dir /tmp/flink-web-2039bed3-d9f9-4950-83ab-6fb70f7fc302
11:58:04,590 INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator     - Triggering checkpoint 66 @ 1448452684590
11:58:04,590 INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator     - Checkpoint triggering task Source: Custom Source (1/10) is not being executed at the moment. Aborting checkpoint.
11:58:05,091 INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator     - Triggering checkpoint 67 @ 1448452685091
11:58:05,091 INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator     - Checkpoint triggering task Source: Custom Source (1/10) is not being executed at the moment. Aborting checkpoint.
11:58:05,590 INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator     - Triggering checkpoint 68 @ 1448452685590
11:58:05,590 INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator     - Checkpoint triggering task Source: Custom Source (1/10) is not being executed at the moment. Aborting checkpoint.
11:58:06,090 INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator     - Triggering checkpoint 69 @ 1448452686090
11:58:06,091 INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator     - Checkpoint triggering task Source: Custom Source (1/10) is not being executed at the moment. Aborting checkpoint.
11:58:06,590 INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator     - Triggering checkpoint 70 @ 1448452686590
{code}
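
The log pattern above (checkpoints still triggered every ~500 ms after remoting has shut down) would be consistent with a periodic trigger that is never cancelled during shutdown. This has not been confirmed against the Flink code. The following is only a minimal Java sketch of that general failure mode and one possible remedy. The class PeriodicTrigger and all of its methods are hypothetical names made up for illustration; they are not Flink's CheckpointCoordinator API.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for a periodic checkpoint trigger; NOT Flink's actual class. */
public class PeriodicTrigger {

    // The default thread factory creates non-daemon threads, so a scheduler
    // that is never shut down can keep the JVM alive after everything else stopped.
    private final ScheduledExecutorService timer =
            Executors.newSingleThreadScheduledExecutor();

    private final AtomicLong nextId = new AtomicLong(1);
    private ScheduledFuture<?> periodic;
    private volatile boolean shutdown;

    /** Starts triggering every intervalMs milliseconds. */
    public synchronized void start(long intervalMs) {
        if (shutdown) {
            throw new IllegalStateException("already shut down");
        }
        periodic = timer.scheduleAtFixedRate(
                () -> { trigger(); }, intervalMs, intervalMs, TimeUnit.MILLISECONDS);
    }

    /** Returns the id of the triggered checkpoint, or -1 if already shut down. */
    public long trigger() {
        if (shutdown) {
            return -1;
        }
        long id = nextId.getAndIncrement();
        System.out.println("Triggering checkpoint " + id + " @ " + System.currentTimeMillis());
        return id;
    }

    /** Must be called from the owner's shutdown path; otherwise triggering continues. */
    public synchronized void shutdown() {
        shutdown = true;
        if (periodic != null) {
            periodic.cancel(false);
        }
        // Stops the scheduler thread so it can no longer block JVM exit.
        timer.shutdownNow();
    }

    public boolean isShutdown() {
        return timer.isShutdown();
    }
}
```

In this sketch, calling start(500) without ever calling shutdown() leaves the scheduler thread running and printing indefinitely, which resembles the behavior in the log. Calling shutdown() from the owning component's stop path cancels the periodic task and releases the thread.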



