flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Alexander Alexandrov <alexander.s.alexand...@gmail.com>
Subject Re: The slot in which the task was scheduled has been killed (probably loss of TaskManager)
Date Mon, 29 Jun 2015 11:08:22 GMT
I witnessed a similar issue yesterday on a simple job (single task chain,
no shuffles) with a release-0.9 based fork.

2015-04-15 14:59 GMT+02:00 Flavio Pompermaier <pompermaier@okkam.it>:

> Yes , sorry for that..I found it somewhere in the logs..the problem was
> that the program didn't die immediately but was somehow hanging and I
> discovered the source of the problem only running the program on a subset
> of the data.
>
> Thnks for the support,
> Flavio
>
> On Wed, Apr 15, 2015 at 2:56 PM, Stephan Ewen <sewen@apache.org> wrote:
>
>> This means that the TaskManager was lost. The JobManager can no longer
>> reach the TaskManager and consists all tasks executing ob the TaskManager
>> as failed.
>>
>> Have a look at the TaskManager log, it should describe why the
>> TaskManager failed.
>> Am 15.04.2015 14:45 schrieb "Flavio Pompermaier" <pompermaier@okkam.it>:
>>
>>> Hi to all,
>>>
>>> I have this strange error in my job and I don't know what's going on.
>>> What can I do?
>>>
>>> The full exception is:
>>>
>>> The slot in which the task was scheduled has been killed (probably loss
>>> of TaskManager).
>>> at
>>> org.apache.flink.runtime.instance.SimpleSlot.cancel(SimpleSlot.java:98)
>>> at
>>> org.apache.flink.runtime.jobmanager.scheduler.SlotSharingGroupAssignment.releaseSimpleSlot(SlotSharingGroupAssignment.java:335)
>>> at
>>> org.apache.flink.runtime.jobmanager.scheduler.SlotSharingGroupAssignment.releaseSharedSlot(SlotSharingGroupAssignment.java:319)
>>> at
>>> org.apache.flink.runtime.instance.SharedSlot.releaseSlot(SharedSlot.java:106)
>>> at org.apache.flink.runtime.instance.Instance.markDead(Instance.java:151)
>>> at
>>> org.apache.flink.runtime.instance.InstanceManager.unregisterTaskManager(InstanceManager.java:182)
>>> at
>>> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$receiveWithLogMessages$1.applyOrElse(JobManager.scala:435)
>>> at
>>> scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
>>> at
>>> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
>>> at
>>> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
>>> at
>>> org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:37)
>>> at
>>> org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:30)
>>> at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
>>> at
>>> org.apache.flink.runtime.ActorLogMessages$$anon$1.applyOrElse(ActorLogMessages.scala:30)
>>> at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
>>> at
>>> org.apache.flink.runtime.jobmanager.JobManager.aroundReceive(JobManager.scala:94)
>>> at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
>>> at akka.actor.ActorCell.invoke(ActorCell.scala:487)
>>> at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
>>> at akka.dispatch.Mailbox.run(Mailbox.scala:221)
>>> at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
>>> at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>>> at
>>> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>>> at
>>> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>>> at
>>> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>>>
>>>
>

Mime
View raw message