flink-user mailing list archives

From Gyula Fóra <gyula.f...@gmail.com>
Subject Task managers can't start on YARN cluster
Date Sat, 12 Nov 2016 12:11:22 GMT
Hi,

I am running into some strange issues on YARN with Flink 1.1.3 and 1.1.4. For
some reason I started getting this error (see the logs below).
The job manager starts and the application is in the ACCEPTED state, but it
cannot seem to communicate with the scheduler. (The 0.0.0.0:8030 address
looks strange.)

I didn't change anything on the YARN cluster, and this seemed to work
previously (but I just can't get it to work now). The yarn-site.xml contains
the proper ResourceManager addresses.
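
One way to double-check what a given yarn-site.xml actually defines is to parse it and resolve the scheduler address by hand. The following is only a minimal sketch: the helper names are hypothetical, it assumes a non-HA setup (no yarn.resourcemanager.ha.rm-ids), and it relies on the stock YARN defaults, where the scheduler address falls back to the ResourceManager hostname on port 8030 and the hostname itself falls back to 0.0.0.0.

```python
import xml.etree.ElementTree as ET

# Standard YARN property names; everything else in this sketch is hypothetical.
RM_KEYS = (
    "yarn.resourcemanager.hostname",
    "yarn.resourcemanager.address",
    "yarn.resourcemanager.scheduler.address",
)


def read_rm_properties(xml_text):
    """Return {name: value} for the ResourceManager keys found in yarn-site.xml text."""
    root = ET.fromstring(xml_text)
    found = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if name in RM_KEYS:
            found[name] = value
    return found


def effective_scheduler_address(props):
    """Resolve the scheduler address: explicit value, else <hostname>:8030,
    else the 0.0.0.0:8030 default (assumes a non-HA ResourceManager)."""
    explicit = props.get("yarn.resourcemanager.scheduler.address")
    if explicit:
        return explicit
    host = props.get("yarn.resourcemanager.hostname") or "0.0.0.0"
    return host + ":8030"


# Usage (the path is an assumption, adjust it to the real config directory):
#   with open("/etc/hadoop/conf/yarn-site.xml") as f:
#       print(effective_scheduler_address(read_rm_properties(f.read())))
```

If this resolves to a real host while the JobManager still connects to 0.0.0.0:8030, the file is probably not the one the process is loading.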

Does anybody have any ideas on where to go from here?

Cheers,
Gyula

JM log:

2016-11-12 11:56:06,894 DEBUG org.apache.hadoop.ipc.Client
                     - The ping interval is 60000 ms.
2016-11-12 11:56:06,894 DEBUG org.apache.hadoop.ipc.Client
                     - Connecting to /0.0.0.0:8030
2016-11-12 11:56:06,899 DEBUG org.apache.hadoop.ipc.Client
                     - closing ipc connection to 0.0.0.0/0.0.0.0:8030:
Connection refused

java.net.ConnectException: Call From
splat24.sto.midasplayer.com/172.25.86.166 to 0.0.0.0:8030 failed on
connection exception: java.net.ConnectException: Connection refused;
For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
	at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:783)
	at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:730)
	at org.apache.hadoop.ipc.Client.call(Client.java:1410)
	at org.apache.hadoop.ipc.Client.call(Client.java:1359)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
	at com.sun.proxy.$Proxy8.registerApplicationMaster(Unknown Source)
	at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.registerApplicationMaster(ApplicationMasterProtocolPBClientImpl.java:106)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:497)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
	at com.sun.proxy.$Proxy9.registerApplicationMaster(Unknown Source)
	at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.registerApplicationMaster(AMRMClientImpl.java:196)
	at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl.registerApplicationMaster(AMRMClientAsyncImpl.java:138)
	at org.apache.flink.yarn.YarnFlinkResourceManager.initialize(YarnFlinkResourceManager.java:259)
	at org.apache.flink.runtime.clusterframework.FlinkResourceManager.preStart(FlinkResourceManager.java:185)
	at akka.actor.Actor$class.aroundPreStart(Actor.scala:470)
	at akka.actor.UntypedActor.aroundPreStart(UntypedActor.scala:97)
	at akka.actor.ActorCell.create(ActorCell.scala:580)
	at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:456)
	at akka.actor.ActorCell.systemInvoke(ActorCell.scala:478)


Client:

2016-11-12 12:31:31,080 INFO
org.apache.flink.yarn.cli.FlinkYarnSessionCli                 - No
path for the flink jar passed. Using the location of class
org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
2016-11-12 12:31:31,080 INFO
org.apache.flink.yarn.cli.FlinkYarnSessionCli                 - No
path for the flink jar passed. Using the location of class
org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
2016-11-12 12:31:31,101 INFO
org.apache.flink.yarn.YarnClusterDescriptor                   - Using
values:
2016-11-12 12:31:31,101 INFO
org.apache.flink.yarn.YarnClusterDescriptor                   -
	TaskManager count = 1
2016-11-12 12:31:31,101 INFO
org.apache.flink.yarn.YarnClusterDescriptor                   -
	JobManager memory = 1024
2016-11-12 12:31:31,102 INFO
org.apache.flink.yarn.YarnClusterDescriptor                   -
	TaskManager memory = 11000
2016-11-12 12:31:31,119 INFO  org.apache.hadoop.yarn.client.RMProxy
                     - Connecting to ResourceManager at /0.0.0.0:8032
2016-11-12 12:31:31,394 WARN
org.apache.flink.yarn.YarnClusterDescriptor                   - The
file system scheme is 'file'. This indicates that the specified Hadoop
configuration path is wrong and the system is using the default Hadoop
configuration values.The Flink YARN client needs to store its files in
a distributed file system
2016-11-12 12:31:31,457 INFO  org.apache.flink.yarn.Utils
                     - Copying from
file:/fjord/sites/flink-1.1.3/conf/log4j.properties to
file:/fjord/splat/flink/yarn/.flink/application_1478896050022_0013/log4j.properties
2016-11-12 12:31:42,321 INFO  org.apache.flink.yarn.Utils
                     - Copying from file:/fjord/sites/flink-1.1.3/lib
to file:/fjord/splat/flink/yarn/.flink/application_1478896050022_0013/lib
2016-11-12 12:32:18,457 INFO  org.apache.flink.yarn.Utils
                     - Copying from
file:/fjord/sites/rbea/rbea-on-flink-1.0-SNAPSHOT.jar to
file:/fjord/splat/flink/yarn/.flink/application_1478896050022_0013/rbea-on-flink-1.0-SNAPSHOT.jar
2016-11-12 12:32:39,725 INFO  org.apache.flink.yarn.Utils
                     - Copying from
file:/fjord/sites/flink-1.1.3/lib/flink-dist_2.10-1.1.4.jar to
file:/fjord/splat/flink/yarn/.flink/application_1478896050022_0013/flink-dist_2.10-1.1.4.jar
2016-11-12 12:32:58,154 INFO  org.apache.flink.yarn.Utils
                     - Copying from
/fjord/sites/flink-1.1.3/conf/flink-conf.yaml to
file:/fjord/splat/flink/yarn/.flink/application_1478896050022_0013/flink-conf.yaml
2016-11-12 12:33:02,218 INFO
org.apache.flink.yarn.YarnClusterDescriptor                   -
Submitting application master application_1478896050022_0013
2016-11-12 12:33:02,256 INFO
org.apache.hadoop.yarn.client.api.impl.YarnClientImpl         -
Submitted application application_1478896050022_0013
2016-11-12 12:33:02,257 INFO
org.apache.flink.yarn.YarnClusterDescriptor                   -
Waiting for the cluster to be allocated
2016-11-12 12:33:02,259 INFO
org.apache.flink.yarn.YarnClusterDescriptor                   -
Deploying cluster, current state ACCEPTED
2016-11-12 12:34:02,485 INFO
org.apache.flink.yarn.YarnClusterDescriptor                   -
Deployment took more than 60 seconds. Please check if the requested
resources are available in the YARN cluster
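
The client warning above says the Hadoop configuration path looks wrong and that default values are in use. A quick way to see which configuration directory the environment points at is sketched below. It is only an illustration: it assumes the directory is supplied through YARN_CONF_DIR or HADOOP_CONF_DIR, the function name is hypothetical, and the lookup order is not necessarily the one Flink uses.

```python
import os

# Environment variables assumed to carry the Hadoop/YARN config directory.
CONF_ENV_VARS = ("YARN_CONF_DIR", "HADOOP_CONF_DIR")


def find_yarn_site(env):
    """Return (variable, path) for the first configured directory that
    contains a yarn-site.xml, or None if no such file is found."""
    for var in CONF_ENV_VARS:
        conf_dir = env.get(var)
        if not conf_dir:
            continue
        candidate = os.path.join(conf_dir, "yarn-site.xml")
        if os.path.isfile(candidate):
            return var, candidate
    return None


# Usage: inspect the environment of the shell that launches the Flink client.
#   print(find_yarn_site(os.environ))
```

A result of None in the shell that runs the client would be consistent with the fallback to 0.0.0.0:8032 and the 'file' scheme seen in the log.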
