flink-user mailing list archives

From Robert Metzger <rmetz...@apache.org>
Subject Re: job manager timeout
Date Thu, 11 Feb 2016 10:04:38 GMT
Hi Radu,

did you check the JobManager logs as well? Maybe there you can see why the
JobManager is failing.

The client timeout is configurable through the "akka.client.timeout"
configuration key. The default value is "60 s".
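
As a minimal sketch, assuming the client picks up its settings from
conf/flink-conf.yaml on the machine you submit the job from, raising the
timeout could look like the fragment below. The value "600 s" is only an
illustrative choice, not a recommendation; pick one that matches how long
your submission actually takes.

```yaml
# conf/flink-conf.yaml on the submitting machine (assumed location)
# Time the client waits for the JobManager during job submission.
# Default is "60 s"; "600 s" here is an arbitrary example value.
akka.client.timeout: 600 s
```

A longer timeout only hides a slow submission, so the JobManager logs are
still worth checking for the underlying cause.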

On Wed, Feb 10, 2016 at 7:35 PM, Radu Tudoran <radu.tudoran@huawei.com>
wrote:

> Hi,
>
> I am running a program that works fine locally, but when I try to run it
> on the cluster I get a timeout error from the client that tries to connect
> to the jobmanager. There is no issue with contacting the jobmanager from
> the machine, as it works just fine for other stream applications. I suspect
> that because the stream topology is rather complex, there is an issue with
> deploying the schematic. I am not sure if this is normal behavior (IMHO
> it should not fail just because the topology is more complex).
> Hence, in case the error helps to identify the underlying issue (if any),
> please see it below.
>
> Meanwhile, can you please tell me how I can configure the timeout
> so that it won’t fail anymore?
>
> Thanks
>
> org.apache.flink.client.program.ProgramInvocationException: The program execution failed: Communication with JobManager failed: Job submission to the JobManager timed out.
>         at org.apache.flink.client.program.Client.runBlocking(Client.java:370)
>         at org.apache.flink.streaming.api.environment.StreamContextEnvironment.execute(StreamContextEnvironment.java:96)
>         at application.MainStreamApp.main(MainStreamApp.java:108)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:497)
>         at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:395)
>         at org.apache.flink.client.program.Client.runBlocking(Client.java:252)
>         at org.apache.flink.client.CliFrontend.executeProgramBlocking(CliFrontend.java:676)
>         at org.apache.flink.client.CliFrontend.run(CliFrontend.java:326)
>         at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:978)
>         at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1028)
> Caused by: org.apache.flink.runtime.client.JobExecutionException: Communication with JobManager failed: Job submission to the JobManager timed out.
>         at org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:140)
>         at org.apache.flink.client.program.Client.runBlocking(Client.java:368)
>         ... 13 more
> Caused by: org.apache.flink.runtime.client.JobClientActorSubmissionTimeoutException: Job submission to the JobManager timed out.
>         at org.apache.flink.runtime.client.JobClientActor.handleMessage(JobClientActor.java:255)
>         at org.apache.flink.runtime.akka.FlinkUntypedActor.handleLeaderSessionID(FlinkUntypedActor.java:88)
>         at org.apache.flink.runtime.akka.FlinkUntypedActor.onReceive(FlinkUntypedActor.java:68)
>         at akka.actor.UntypedActor$$anonfun$receive$1.applyOrElse(UntypedActor.scala:167)
>         at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
>         at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:97)
>         at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
>         at akka.actor.ActorCell.invoke(ActorCell.scala:487)
>         at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
>         at akka.dispatch.Mailbox.run(Mailbox.scala:221)
>         at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
>         at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>         at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
>         at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
>         at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>         at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>
> Dr. Radu Tudoran
> Research Engineer - Big Data Expert
> IT R&D Division
>
> HUAWEI TECHNOLOGIES Duesseldorf GmbH
> European Research Center
> Riesstrasse 25, 80992 München
>
> E-mail: radu.tudoran@huawei.com
> Mobile: +49 15209084330
> Telephone: +49 891588344173
