flink-user mailing list archives

From Pieter Hameete <phame...@gmail.com>
Subject Re: Flink on YARN: Stuck on "Trying to register at JobManager"
Date Sat, 06 Feb 2016 13:11:38 GMT
Hi Max!

I'm using Flink 0.10.1, and indeed the cluster seems to be created fine;
everything in the JobManager web UI looks good.

It seems like the JobManager initiates the connection with my VM and cannot
reach it. It could be that this is similar to the problem here:

http://apache-spark-user-list.1001560.n3.nabble.com/spark-with-docker-errors-with-akka-NAT-td7702.html

I probably have to make some changes to the networking configuration of my
VM so it can be reached by the JobManager despite using a different port
each time.
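
A successful ping only shows that ICMP gets through; it says nothing about whether a given TCP port is open. Below is a minimal sketch for checking that, assuming Python is available on the machine you test from. `tcp_reachable` is a hypothetical helper written for illustration, not part of Flink or YARN.

```python
import socket


def tcp_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    Hypothetical helper for illustration only; it attempts a plain TCP
    connect and does not speak the Akka remoting protocol.
    """
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable networks.
        return False
```

For example, `tcp_reachable("145.100.41.148", 35666)` run from the VM would test the JobManager address shown in the log quoted below. The same check run from a cluster node against the VM's address and client port would test the reverse direction, which is the one suspected here. The port changes with every session, so it has to be read from the current log output.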

- Pieter

2016-02-06 14:05 GMT+01:00 Maximilian Michels <mxm@apache.org>:

> Hi Pieter,
>
> Which version of Flink are you using? It appears you've created a
> Flink YARN cluster but you can't reach the JobManager afterwards.
>
> Cheers,
> Max
>
> On Sat, Feb 6, 2016 at 1:42 PM, Pieter Hameete <phameete@gmail.com> wrote:
> > Hi Robert,
> >
> > Unfortunately there are no signs of what is going wrong in the logs. The
> > last log messages are about successful registration of the TaskManagers.
> >
> > I'm also fairly sure it must be something in my VM that is causing this,
> > because when I start the yarn-session from a login node that is on the
> > same network as the hadoop cluster there are no problems registering with
> > the JobManager. I did also notice the following message in the local
> > console:
> >
> > 12:30:27,173 WARN  Remoting
> > - Tried to associate with unreachable remote address
> > [akka.tcp://flink@145.100.41.13:41539]. Address is now gated for 5000 ms,
> > all messages to this address will be delivered to dead letters. Reason:
> > connection timed out: /145.100.41.13:41539
> >
> > I can ping the JobManager fine from within the VM. Could there be some
> > invalid or missing configuration on my side?
> >
> > Cheers,
> >
> > Pieter
> >
> >
> > 2016-02-06 12:54 GMT+01:00 Robert Metzger <rmetzger@apache.org>:
> >>
> >> Hi,
> >>
> >> did you check the logs of the JobManager itself? Maybe it'll tell us
> >> already what's going on.
> >>
> >> On Sat, Feb 6, 2016 at 12:14 PM, Pieter Hameete <phameete@gmail.com>
> >> wrote:
> >>>
> >>> Hi Guys!
> >>>
> >>> I'm attempting to run Flink on YARN, but I run into an issue. I'm starting
> >>> the Flink YARN session from an Ubuntu 14.04 VM. All goes well until after
> >>> the JobManager web UI is started:
> >>>
> >>> JobManager web interface address
> >>> http://head05.hathi.surfsara.nl:8088/proxy/application_1452780322684_10532/
> >>> Waiting until all TaskManagers have connected
> >>> 11:09:51,557 INFO  org.apache.flink.yarn.ApplicationClient
> >>> - Notification about new leader address
> >>> akka.tcp://flink@145.100.41.148:35666/user/jobmanager with session ID null.
> >>> No status updates from the YARN cluster received so far. Waiting ...
> >>> 11:09:51,578 INFO  org.apache.flink.yarn.ApplicationClient
> >>> - Received address of new leader
> >>> akka.tcp://flink@145.100.41.148:35666/user/jobmanager with session ID null.
> >>> 11:09:51,583 INFO  org.apache.flink.yarn.ApplicationClient
> >>> - Disconnect from JobManager null.
> >>> 11:09:51,595 INFO  org.apache.flink.yarn.ApplicationClient
> >>> - Trying to register at JobManager
> >>> akka.tcp://flink@145.100.41.148:35666/user/jobmanager.
> >>> No status updates from the YARN cluster received so far. Waiting ...
> >>> No status updates from the YARN cluster received so far. Waiting ...
> >>>
> >>> It then hangs at these last steps (trying to register, no status
> >>> updates).
> >>>
> >>> I'm sure there must be a problem on my side that prevents me from
> >>> registering at the JobManager. What could cause such connection
> >>> problems?
> >>>
> >>> Any tips are very welcome :-)
> >>>
> >>> Cheers and have a good weekend!
> >>>
> >>> - Pieter
> >>>
> >>>
> >>
> >
>
