flink-issues mailing list archives

From "Stephan Ewen (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (FLINK-230) Job Cancellation does not work properly: "Cannot find execution graph to job ID"
Date Sun, 21 Sep 2014 02:20:34 GMT

     [ https://issues.apache.org/jira/browse/FLINK-230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stephan Ewen resolved FLINK-230.
--------------------------------
       Resolution: Fixed
    Fix Version/s:     (was: pre-apache)
                   0.7-incubating
         Assignee: Stephan Ewen

Fixed in ae139f5ae2199a52e8d7f561f94db51631107d00

> Job Cancellation does not work properly: "Cannot find execution graph to job ID"
> --------------------------------------------------------------------------------
>
>                 Key: FLINK-230
>                 URL: https://issues.apache.org/jira/browse/FLINK-230
>             Project: Flink
>          Issue Type: Bug
>            Reporter: GitHub Import
>            Assignee: Stephan Ewen
>              Labels: github-import
>             Fix For: 0.7-incubating
>
>
> Hi,
> I noticed this error message on a failing Job.
> ```
> 12:37:10,697 INFO  eu.stratosphere.nephele.execution.ExecutionStateTransition    - JM: ExecutionState set from CANCELING to CANCELED for task Invoices file (7/8)
> 12:37:10,697 INFO  eu.stratosphere.nephele.execution.ExecutionStateTransition    - JM: ExecutionState set from CANCELING to CANCELED for task (#2) filter invoices: month <= 12 (8/8)
> 12:37:10,697 INFO  eu.stratosphere.nephele.jobmanager.scheduler.AbstractScheduler  - Releasing instance hadoop02
> 12:37:10,699 INFO  eu.stratosphere.nephele.jobmanager.JobManager                 - Status of job XX 0b7407b5ad73a40043c36c16baacf400) changed to FAILED
> 12:37:10,706 ERROR eu.stratosphere.nephele.jobmanager.JobManager                 - Cannot find execution graph to job ID 0b7407b5ad73a40043c36c16baacf400
> 12:37:10,706 ERROR eu.stratosphere.nephele.jobmanager.JobManager                 - Cannot find execution graph to job ID 0b7407b5ad73a40043c36c16baacf400
> 12:37:10,709 ERROR eu.stratosphere.nephele.jobmanager.JobManager                 - Cannot find execution graph to job ID 0b7407b5ad73a40043c36c16baacf400
> ```
> The error occurs quite often:
> ```
> rmetzger@hadoop01:~/log$ cat nephele-rmetzger-jobmanager-hadoop01.log | grep "Cannot find"  | wc -l
> 21262
> ```
> The TaskManager also reports errors:
> ```
> 12:37:14,951 ERROR eu.stratosphere.nephele.taskmanager.bytebuffered.ByteBufferedChannelManager - Cannot find task(s) waiting for data from source channel with ID 43930c029c759c003792e4dfd4411800
> 12:37:14,952 ERROR eu.stratosphere.nephele.taskmanager.bytebuffered.ByteBufferedChannelManager - Cannot find task(s) waiting for data from source channel with ID 0937e32b635954000efb7f68c0c80c00
> 12:37:14,953 ERROR eu.stratosphere.nephele.taskmanager.bytebuffered.ByteBufferedChannelManager - Cannot find task(s) waiting for data from source channel with ID 6922a9feb031540009dbe583b3fbe800
> ```
> ```
> rmetzger@hadoop01:~/log$ cat nephele-rmetzger-taskmanager-hadoop01.log | grep "for data from source channel" | wc -l
> 6612
> ```
> I also saw this:
> ```
> 12:00:00,221 ERROR eu.stratosphere.nephele.execution.ExecutionStateTransition    - java.lang.IllegalStateException: Unexpected state change: CANCELING -> FAILED
>         at eu.stratosphere.nephele.execution.ExecutionStateTransition.checkTransition(ExecutionStateTransition.java:167)
>         at eu.stratosphere.nephele.executiongraph.ExecutionVertex.updateExecutionState(ExecutionVertex.java:384)
>         at eu.stratosphere.nephele.executiongraph.ExecutionVertex$1.run(ExecutionVertex.java:319)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:662)
> ```
> I have not seen this behavior before, so it could be new (I did not make any major changes to the job).
> ---------------- Imported from GitHub ----------------
> Url: https://github.com/stratosphere/stratosphere/issues/230
> Created by: [rmetzger|https://github.com/rmetzger]
> Labels: bug, runtime
> Created at: Fri Nov 01 15:02:46 CET 2013
> State: open



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
