flink-user mailing list archives

From Ufuk Celebi <...@apache.org>
Subject Re: Task managers cant start on YARN cluster
Date Mon, 14 Nov 2016 08:42:26 GMT
Good to know that you solved this. :) Do you think there is something we can do to help users
notice this situation faster?

– Ufuk

On 13 November 2016 at 00:23:21, Gyula Fóra (gyula.fora@gmail.com) wrote:
> Hi,
>  
> What happened is that I compiled Flink with the wrong Hadoop version...
>  
> Sorry :)
> Gyula
>  
> Gyula Fóra wrote (on Sat, 12 Nov 2016, 13:11):
>  
> > Hi,
> >
> > I am running into some strange issues on YARN with Flink 1.1.3 and 1.1.4. For
> > some reason I started getting this error (see the logs below).
> > The job manager starts and the application is in the ACCEPTED state, but it
> > cannot seem to communicate with the scheduler (0.0.0.0:8030 seems
> > strange).
> >
> > I didn't change anything on the YARN cluster and this seemed to work
> > previously (but I just can't get it to work now). The yarn-site.xml contains
> > the proper ResourceManager addresses.
> >
> > Does anybody have any ideas where to go from here?
> >
> > Cheers,
> > Gyula
> >
> > JM log:
> >
> > 2016-11-12 11:56:06,894 DEBUG org.apache.hadoop.ipc.Client - The ping interval is 60000 ms.
> > 2016-11-12 11:56:06,894 DEBUG org.apache.hadoop.ipc.Client - Connecting to /0.0.0.0:8030
> > 2016-11-12 11:56:06,899 DEBUG org.apache.hadoop.ipc.Client - closing ipc connection to 0.0.0.0/0.0.0.0:8030: Connection refused
> >
> > java.net.ConnectException: Call From splat24.sto.midasplayer.com/172.25.86.166 to 0.0.0.0:8030 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
> > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> > at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> > at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> > at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
> > at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:783)
> > at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:730)
> > at org.apache.hadoop.ipc.Client.call(Client.java:1410)
> > at org.apache.hadoop.ipc.Client.call(Client.java:1359)
> > at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
> > at com.sun.proxy.$Proxy8.registerApplicationMaster(Unknown Source)
> > at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.registerApplicationMaster(ApplicationMasterProtocolPBClientImpl.java:106)
> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> > at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > at java.lang.reflect.Method.invoke(Method.java:497)
> > at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
> > at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
> > at com.sun.proxy.$Proxy9.registerApplicationMaster(Unknown Source)
> > at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.registerApplicationMaster(AMRMClientImpl.java:196)
> > at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl.registerApplicationMaster(AMRMClientAsyncImpl.java:138)
> > at org.apache.flink.yarn.YarnFlinkResourceManager.initialize(YarnFlinkResourceManager.java:259)
> > at org.apache.flink.runtime.clusterframework.FlinkResourceManager.preStart(FlinkResourceManager.java:185)
> > at akka.actor.Actor$class.aroundPreStart(Actor.scala:470)
> > at akka.actor.UntypedActor.aroundPreStart(UntypedActor.scala:97)
> > at akka.actor.ActorCell.create(ActorCell.scala:580)
> > at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:456)
> > at akka.actor.ActorCell.systemInvoke(ActorCell.scala:478)
> >
> >
> > Client:
> >
> > 2016-11-12 12:31:31,080 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> > 2016-11-12 12:31:31,080 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> > 2016-11-12 12:31:31,101 INFO org.apache.flink.yarn.YarnClusterDescriptor - Using values:
> > 2016-11-12 12:31:31,101 INFO org.apache.flink.yarn.YarnClusterDescriptor - TaskManager count = 1
> > 2016-11-12 12:31:31,101 INFO org.apache.flink.yarn.YarnClusterDescriptor - JobManager memory = 1024
> > 2016-11-12 12:31:31,102 INFO org.apache.flink.yarn.YarnClusterDescriptor - TaskManager memory = 11000
> > 2016-11-12 12:31:31,119 INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at /0.0.0.0:8032
> > 2016-11-12 12:31:31,394 WARN org.apache.flink.yarn.YarnClusterDescriptor - The file system scheme is 'file'. This indicates that the specified Hadoop configuration path is wrong and the system is using the default Hadoop configuration values.The Flink YARN client needs to store its files in a distributed file system
> > 2016-11-12 12:31:31,457 INFO org.apache.flink.yarn.Utils - Copying from file:/fjord/sites/flink-1.1.3/conf/log4j.properties to file:/fjord/splat/flink/yarn/.flink/application_1478896050022_0013/log4j.properties
> > 2016-11-12 12:31:42,321 INFO org.apache.flink.yarn.Utils - Copying from file:/fjord/sites/flink-1.1.3/lib to file:/fjord/splat/flink/yarn/.flink/application_1478896050022_0013/lib
> > 2016-11-12 12:32:18,457 INFO org.apache.flink.yarn.Utils - Copying from file:/fjord/sites/rbea/rbea-on-flink-1.0-SNAPSHOT.jar to file:/fjord/splat/flink/yarn/.flink/application_1478896050022_0013/rbea-on-flink-1.0-SNAPSHOT.jar
> > 2016-11-12 12:32:39,725 INFO org.apache.flink.yarn.Utils - Copying from file:/fjord/sites/flink-1.1.3/lib/flink-dist_2.10-1.1.4.jar to file:/fjord/splat/flink/yarn/.flink/application_1478896050022_0013/flink-dist_2.10-1.1.4.jar
> > 2016-11-12 12:32:58,154 INFO org.apache.flink.yarn.Utils - Copying from /fjord/sites/flink-1.1.3/conf/flink-conf.yaml to file:/fjord/splat/flink/yarn/.flink/application_1478896050022_0013/flink-conf.yaml
> > 2016-11-12 12:33:02,218 INFO org.apache.flink.yarn.YarnClusterDescriptor - Submitting application master application_1478896050022_0013
> > 2016-11-12 12:33:02,256 INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1478896050022_0013
> > 2016-11-12 12:33:02,257 INFO org.apache.flink.yarn.YarnClusterDescriptor - Waiting for the cluster to be allocated
> > 2016-11-12 12:33:02,259 INFO org.apache.flink.yarn.YarnClusterDescriptor - Deploying cluster, current state ACCEPTED
> > 2016-11-12 12:34:02,485 INFO org.apache.flink.yarn.YarnClusterDescriptor - Deployment took more than 60 seconds. Please check if the requested resources are available in the YARN cluster
> >
> >
>  
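
On the question of noticing this faster: both quoted logs already carry the symptom. The JobManager tries to reach the scheduler at 0.0.0.0:8030, the client connects to the ResourceManager at 0.0.0.0:8032, and the client warns that the file system scheme is 'file', which by its own wording means the default Hadoop configuration values are in use. What follows is only a rough sketch of a log check for those lines, written in Python. The names `SYMPTOMS` and `find_default_config_symptoms` are made up for this illustration and are not part of Flink or Hadoop, and the patterns are simply copied from the log lines in this thread, so they may not match other versions.

```python
import re

# Hypothetical helper, not part of Flink or Hadoop. The patterns are copied
# from the log lines quoted in this thread and may differ in other versions.
SYMPTOMS = [
    (
        re.compile(r"Connecting to (?:ResourceManager at )?/0\.0\.0\.0:(\d+)"),
        "connecting to 0.0.0.0:{0}, which suggests default YARN addresses are "
        "in use (check the Hadoop configuration path and the Hadoop version "
        "Flink was built against)",
    ),
    (
        re.compile(r"The file system scheme is 'file'"),
        "file system scheme is 'file', so the default Hadoop configuration "
        "values seem to be in use",
    ),
]


def find_default_config_symptoms(log_text):
    """Return (line_number, message) pairs for log lines matching a symptom."""
    findings = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        for pattern, message in SYMPTOMS:
            match = pattern.search(line)
            if match:
                # Fill the captured port (if any) into the message.
                findings.append((number, message.format(*match.groups())))
    return findings


if __name__ == "__main__":
    sample = (
        "2016-11-12 11:56:06,894 DEBUG org.apache.hadoop.ipc.Client - "
        "Connecting to /0.0.0.0:8030\n"
        "2016-11-12 12:31:31,119 INFO org.apache.hadoop.yarn.client.RMProxy - "
        "Connecting to ResourceManager at /0.0.0.0:8032\n"
        "2016-11-12 12:33:02,259 INFO org.apache.flink.yarn.YarnClusterDescriptor - "
        "Deploying cluster, current state ACCEPTED\n"
    )
    for number, message in find_default_config_symptoms(sample):
        print("line {}: {}".format(number, message))
```

A check like this only points at the symptom. In this thread the actual cause was a Flink build compiled against the wrong Hadoop version, even though yarn-site.xml contained the proper addresses.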

