flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Radu Tudoran <radu.tudo...@huawei.com>
Subject RE: lost connection
Date Thu, 21 Apr 2016 14:30:45 GMT
Yes - it suddenly occurred on something that used to work. I am restarting the deployment to
see if this solves the problem

Dr. Radu Tudoran
Research Engineer - Big Data Expert
IT R&D Division

[cid:image007.jpg@01CD52EB.AD060EE0]
HUAWEI TECHNOLOGIES Duesseldorf GmbH
European Research Center
Riesstrasse 25, 80992 München

E-mail: radu.tudoran@huawei.com
Mobile: +49 15209084330
Telephone: +49 891588344173

HUAWEI TECHNOLOGIES Duesseldorf GmbH
Hansaallee 205, 40549 Düsseldorf, Germany, www.huawei.com<http://www.huawei.com/>
Registered Office: Düsseldorf, Register Court Düsseldorf, HRB 56063,
Managing Director: Bo PENG, Wanzhou MENG, Lifang CHEN
Sitz der Gesellschaft: Düsseldorf, Amtsgericht Düsseldorf, HRB 56063,
Geschäftsführer: Bo PENG, Wanzhou MENG, Lifang CHEN
This e-mail and its attachments contain confidential information from HUAWEI, which is intended
only for the person or entity whose address is listed above. Any use of the information contained
herein in any way (including, but not limited to, total or partial disclosure, reproduction,
or dissemination) by persons other than the intended recipient(s) is prohibited. If you receive
this e-mail in error, please notify the sender by phone or email immediately and delete it!

From: Chesnay Schepler [mailto:chesnay@apache.org]
Sent: Thursday, April 21, 2016 4:26 PM
To: user@flink.apache.org
Subject: Re: lost connection

That is an exempt from the client log, can you check the JobManager log? It could have crashed,
and if so the cause is hopefully in there.

Did this issue suddenly occur; as in have you run a job successfully on the system before?
(to exclude network configuration issues)

Regards,
Chesnay

On 21.04.2016 16:09, Radu Tudoran wrote:
- Could not submit job Operator2 execution (170aef70d31f3fee62f8a483930be213), because there
is no connection to a JobManager.
15:59:48,456 WARN  Remoting                                                      - Tried to
associate with unreachable remote address [akka.tcp://flink@10.204.62.71:6123<mailto:akka.tcp://flink@10.204.62.71:6123>].
Address is now gated for 5000 ms, all messages to this address will be delivered to dead letters.
Reason: Connection refused: /10.204.62.71:6123
16:01:28,409 ERROR org.apache.flink.client.CliFrontend                           - Error while
running the command.
org.apache.flink.client.program.ProgramInvocationException: The program execution failed:
Communication with JobManager failed: Lost connection to the JobManager.

I do not understand what could be the root cause of this... the IPs look ok and there is not
firewall to block things...

Dr. Radu Tudoran
Research Engineer - Big Data Expert
IT R&D Division

[cid:image007.jpg@01CD52EB.AD060EE0]
HUAWEI TECHNOLOGIES Duesseldorf GmbH
European Research Center
Riesstrasse 25, 80992 München

E-mail: radu.tudoran@huawei.com<mailto:radu.tudoran@huawei.com>
Mobile: +49 15209084330
Telephone: +49 891588344173

HUAWEI TECHNOLOGIES Duesseldorf GmbH
Hansaallee 205, 40549 Düsseldorf, Germany, www.huawei.com<http://www.huawei.com/>
Registered Office: Düsseldorf, Register Court Düsseldorf, HRB 56063,
Managing Director: Bo PENG, Wanzhou MENG, Lifang CHEN
Sitz der Gesellschaft: Düsseldorf, Amtsgericht Düsseldorf, HRB 56063,
Geschäftsführer: Bo PENG, Wanzhou MENG, Lifang CHEN
This e-mail and its attachments contain confidential information from HUAWEI, which is intended
only for the person or entity whose address is listed above. Any use of the information contained
herein in any way (including, but not limited to, total or partial disclosure, reproduction,
or dissemination) by persons other than the intended recipient(s) is prohibited. If you receive
this e-mail in error, please notify the sender by phone or email immediately and delete it!

From: Chesnay Schepler [mailto:chesnay@apache.org]
Sent: Thursday, April 21, 2016 3:58 PM
To: user@flink.apache.org<mailto:user@flink.apache.org>
Subject: Re: lost connection

Hello,

the first step is always to check the logs under /log. The JobManager log in particular may
contain clues as why no connection could be established.

Regards,
Chesnay

On 21.04.2016 15:44, Radu Tudoran wrote:
Hi,

I am trying to submit a jar via the console (flink run my.jar). The result is that I get an
error saying that the communication with the jobmanager failed: Lost connection to the jobmanager.
Can you give me some hints/ recommendations about approaching this issue.

Thanks

Dr. Radu Tudoran
Research Engineer - Big Data Expert
IT R&D Division

[cid:image007.jpg@01CD52EB.AD060EE0]
HUAWEI TECHNOLOGIES Duesseldorf GmbH
European Research Center
Riesstrasse 25, 80992 München

E-mail: radu.tudoran@huawei.com<mailto:radu.tudoran@huawei.com>
Mobile: +49 15209084330
Telephone: +49 891588344173

HUAWEI TECHNOLOGIES Duesseldorf GmbH
Hansaallee 205, 40549 Düsseldorf, Germany, www.huawei.com<http://www.huawei.com/>
Registered Office: Düsseldorf, Register Court Düsseldorf, HRB 56063,
Managing Director: Bo PENG, Wanzhou MENG, Lifang CHEN
Sitz der Gesellschaft: Düsseldorf, Amtsgericht Düsseldorf, HRB 56063,
Geschäftsführer: Bo PENG, Wanzhou MENG, Lifang CHEN
This e-mail and its attachments contain confidential information from HUAWEI, which is intended
only for the person or entity whose address is listed above. Any use of the information contained
herein in any way (including, but not limited to, total or partial disclosure, reproduction,
or dissemination) by persons other than the intended recipient(s) is prohibited. If you receive
this e-mail in error, please notify the sender by phone or email immediately and delete it!




Mime
View raw message