flink-user mailing list archives

From Stefano Baghino <stefano.bagh...@radicalbit.io>
Subject Master (1.1-SNAPSHOT) Can't run on YARN
Date Tue, 19 Apr 2016 16:31:32 GMT
Hi everyone,

I'm currently running into a strange situation and I hope you can help me
out with it.

I've cloned and built from master, then edited the default config file to
add my Hadoop config path, exported the HADOOP_CONF_DIR env var, and ran:

    bin/yarn-session.sh -n 1 -s 2 -jm 2048 -tm 2048

The first thing I noticed is that I had to pass "-s 2", otherwise the task
managers get created with -1 slots (!) by default.

After adding "-s 2", the YARN session startup hangs while trying to register
the task managers. I stopped the session, aggregated the logs, and found
several thousand occurrences of the messages attached at the bottom. Any
idea what might be causing this?

Thanks a lot in advance!

2016-04-19 12:15:59,507 INFO  org.apache.flink.yarn.YarnTaskManager                - Trying to register at JobManager akka.tcp://flink@172.31.20.101:57379/user/jobmanager (attempt 1, timeout: 500 milliseconds)

2016-04-19 12:15:59,649 ERROR org.apache.flink.yarn.YarnTaskManager                - The registration at JobManager Some(akka.tcp://flink@172.31.20.101:57379/user/jobmanager) was refused, because: java.lang.IllegalStateException: Resource ResourceID{resourceId='container_e02_1461077293721_0016_01_000002'} not registered with resource manager.. Retrying later...

2016-04-19 12:16:00,025 INFO  org.apache.flink.yarn.YarnTaskManager                - Trying to register at JobManager akka.tcp://flink@172.31.20.101:57379/user/jobmanager (attempt 2, timeout: 1000 milliseconds)

2016-04-19 12:16:00,033 ERROR org.apache.flink.yarn.YarnTaskManager                - The registration at JobManager Some(akka.tcp://flink@172.31.20.101:57379/user/jobmanager) was refused, because: java.lang.IllegalStateException: Resource ResourceID{resourceId='container_e02_1461077293721_0016_01_000002'} not registered with resource manager.. Retrying later...

2016-04-19 12:16:01,045 INFO  org.apache.flink.yarn.YarnTaskManager                - Trying to register at JobManager akka.tcp://flink@172.31.20.101:57379/user/jobmanager (attempt 3, timeout: 2000 milliseconds)

2016-04-19 12:16:01,053 ERROR org.apache.flink.yarn.YarnTaskManager                - The registration at JobManager Some(akka.tcp://flink@172.31.20.101:57379/user/jobmanager) was refused, because: java.lang.IllegalStateException: Resource ResourceID{resourceId='container_e02_1461077293721_0016_01_000002'} not registered with resource manager.. Retrying later...

2016-04-19 12:16:03,064 INFO  org.apache.flink.yarn.YarnTaskManager                - Trying to register at JobManager akka.tcp://flink@172.31.20.101:57379/user/jobmanager (attempt 4, timeout: 4000 milliseconds)

2016-04-19 12:16:03,072 ERROR org.apache.flink.yarn.YarnTaskManager                - The registration at JobManager Some(akka.tcp://flink@172.31.20.101:57379/user/jobmanager) was refused, because: java.lang.IllegalStateException: Resource ResourceID{resourceId='container_e02_1461077293721_0016_01_000002'} not registered with resource manager.. Retrying later...

2016-04-19 12:16:07,085 INFO  org.apache.flink.yarn.YarnTaskManager                - Trying to register at JobManager akka.tcp://flink@172.31.20.101:57379/user/jobmanager (attempt 5, timeout: 8000 milliseconds)

2016-04-19 12:16:07,092 ERROR org.apache.flink.yarn.YarnTaskManager                - The registration at JobManager Some(akka.tcp://flink@172.31.20.101:57379/user/jobmanager) was refused, because: java.lang.IllegalStateException: Resource ResourceID{resourceId='container_e02_1461077293721_0016_01_000002'} not registered with resource manager.. Retrying later...

2016-04-19 12:16:09,664 INFO  org.apache.flink.yarn.YarnTaskManager                - Trying to register at JobManager akka.tcp://flink@172.31.20.101:57379/user/jobmanager (attempt 1, timeout: 500 milliseconds)
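
Since the aggregated logs contain thousands of these entries, a small script can condense them before digging further. The following is only a minimal sketch, assuming the log text looks like the excerpt above; the names `summarize`, `REFUSED`, `ATTEMPT` and `SAMPLE` are made up for illustration and are not part of Flink or YARN. It counts refused registrations per container ID and collects the (attempt, timeout) pairs, which makes the retry pattern easy to see.

```python
import re
from collections import Counter

# Refusal entries as they appear in the excerpt above. \s+ and DOTALL tolerate
# the line wrapping that mail clients add to long log lines.
REFUSED = re.compile(r"was\s+refused,\s+because:.*?resourceId='([^']+)'", re.DOTALL)
# Registration attempts, e.g. "(attempt 2, timeout: 1000 milliseconds)".
ATTEMPT = re.compile(r"\(attempt\s+(\d+),\s+timeout:\s+(\d+)\s+milliseconds\)")


def summarize(log_text):
    """Return (refusals per container ID, list of (attempt, timeout_ms))."""
    refused = Counter(REFUSED.findall(log_text))
    attempts = [(int(n), int(ms)) for n, ms in ATTEMPT.findall(log_text)]
    return refused, attempts


# Two entries copied from the excerpt above, with the mail line wrapping kept.
SAMPLE = """2016-04-19 12:15:59,507 INFO  org.apache.flink.yarn.YarnTaskManager
                - Trying to register at JobManager akka.tcp://
flink@172.31.20.101:57379/user/jobmanager (attempt 1, timeout: 500
milliseconds)

2016-04-19 12:15:59,649 ERROR org.apache.flink.yarn.YarnTaskManager
                - The registration at JobManager Some(akka.tcp://
flink@172.31.20.101:57379/user/jobmanager) was refused, because:
java.lang.IllegalStateException: Resource
ResourceID{resourceId='container_e02_1461077293721_0016_01_000002'} not
registered with resource manager.. Retrying later...
"""

if __name__ == "__main__":
    refused, attempts = summarize(SAMPLE)
    for container, count in refused.most_common():
        print(container, count)
    print(attempts)
```

To run it on real output, pass the contents of the saved aggregated log file to `summarize` instead of `SAMPLE`. One caveat of this sketch: a refusal entry that carries no resource ID would be attributed to the next resource ID found in the text.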

-- 
BR,
Stefano Baghino

Software Engineer @ Radicalbit
