Mailing-List: contact user-help@flink.apache.org; run by ezmlm
Reply-To: user@flink.apache.org
From: Till Rohrmann
Date: Fri, 17 Feb 2017 11:32:44 +0100
Subject: Re: Can't run flink on yarn on version 1.2.0
To: user@flink.apache.org

Hi Howard,

could you check whether the JobManager's actor system was bound to
"vip-rc-vsubu.vclound.com:55926"? You should see that in the JobManager
logs. Furthermore, have you checked that your YARN cluster nodes are
actually reachable from the node where you start the Flink application?
If so, the logs of the CLI client as well as the JobManager logs (both on
debug level) would be tremendously helpful.

Cheers,
Till

On Fri, Feb 17, 2017 at 10:41 AM, Howard,Li(vip.com) wrote:

> Sorry for the confusion I caused. I did copy the wrong log, but we do hit
> this problem on 1.2.0.
>
> With version 1.1.4, however, we see it in one cluster but not in another.
> We are still trying to figure out what happened.
>
> The following is the log for version 1.2.0:
>
> 2017-02-17 15:51:37,775 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2017-02-17 15:51:37,775 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2017-02-17 15:51:37,803 INFO org.apache.flink.yarn.YarnClusterDescriptor - Using values:
> 2017-02-17 15:51:37,804 INFO org.apache.flink.yarn.YarnClusterDescriptor - TaskManager count = 2
> 2017-02-17 15:51:37,804 INFO org.apache.flink.yarn.YarnClusterDescriptor - JobManager memory = 1024
> 2017-02-17 15:51:37,804 INFO org.apache.flink.yarn.
> YarnClusterDescriptor - TaskManager memory = 1024
> 2017-02-17 15:51:37,827 INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at /0.0.0.0:8032
> 2017-02-17 15:51:38,672 WARN org.apache.flink.yarn.YarnClusterDescriptor - The configuration directory ('/home/software/flink-1.2.0/conf') contains both LOG4J and Logback configuration files. Please delete or rename one of them.
> 2017-02-17 15:51:38,685 INFO org.apache.flink.yarn.Utils - Copying from file:/home/software/flink-1.2.0/examples/batch/WordCount.jar to hdfs://10.199.202.161:9000/user/root/.flink/application_1487247313588_0016/WordCount.jar
> 2017-02-17 15:51:38,992 INFO org.apache.flink.yarn.Utils - Copying from file:/home/software/flink-1.2.0/conf/log4j.properties to hdfs://10.199.202.161:9000/user/root/.flink/application_1487247313588_0016/log4j.properties
> 2017-02-17 15:51:39,058 INFO org.apache.flink.yarn.Utils - Copying from file:/home/software/flink-1.2.0/conf/logback.xml to hdfs://10.199.202.161:9000/user/root/.flink/application_1487247313588_0016/logback.xml
> 2017-02-17 15:51:39,085 INFO org.apache.flink.yarn.Utils - Copying from file:/home/software/flink-1.2.0/lib to hdfs://10.199.202.161:9000/user/root/.flink/application_1487247313588_0016/lib
> 2017-02-17 15:51:39,695 INFO org.apache.flink.yarn.Utils - Copying from file:/home/software/flink-1.2.0/lib/flink-dist_2.11-1.2.0.jar to hdfs://10.199.202.161:9000/user/root/.flink/application_1487247313588_0016/flink-dist_2.11-1.2.0.jar
> 2017-02-17 15:51:40,493 INFO org.apache.flink.yarn.Utils - Copying from /home/software/flink-1.2.0/conf/flink-conf.yaml to hdfs://10.199.202.161:9000/user/root/.flink/application_1487247313588_0016/flink-conf.yaml
> 2017-02-17 15:51:40,547 INFO org.apache.flink.yarn.
> YarnClusterDescriptor - Submitting application master application_1487247313588_0016
> 2017-02-17 15:51:40,585 INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1487247313588_0016
> 2017-02-17 15:51:40,585 INFO org.apache.flink.yarn.YarnClusterDescriptor - Waiting for the cluster to be allocated
> 2017-02-17 15:51:40,587 INFO org.apache.flink.yarn.YarnClusterDescriptor - Deploying cluster, current state ACCEPTED
> 2017-02-17 15:51:45,879 INFO org.apache.flink.yarn.YarnClusterDescriptor - YARN application has been deployed successfully.
> Cluster started: Yarn cluster with application id application_1487247313588_0016
> Using address vip-rc-vsubu.vclound.com:55926 to connect to JobManager.
> JobManager web interface address http://vip-rc-ucsww.vclound.com:8088/proxy/application_1487247313588_0016/
> Using the parallelism provided by the remote cluster (8). To use another parallelism, set it at the ./bin/flink client.
> Starting execution of program
> 2017-02-17 15:51:46,704 INFO org.apache.flink.yarn.YarnClusterClient - Starting program in interactive mode
> Executing WordCount example with default input data set.
> Use --input to specify file input.
> Printing result to stdout. Use --output to specify output path.
> 2017-02-17 15:51:47,029 INFO org.apache.flink.yarn.YarnClusterClient - Waiting until all TaskManagers have connected
> Waiting until all TaskManagers have connected
> 2017-02-17 15:51:47,029 INFO org.apache.flink.yarn.YarnClusterClient - Starting client actor system.
>
> ------------------------------------------------------------
> The program finished with the following exception:
>
> org.apache.flink.client.program.ProgramInvocationException: The main method caused an error.
>     at org.apache.flink.client.program.PackagedProgram.
> callMainMethod(PackagedProgram.java:545)
>     at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:419)
>     at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:339)
>     at org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:831)
>     at org.apache.flink.client.CliFrontend.run(CliFrontend.java:256)
>     at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1073)
>     at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1120)
>     at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1117)
>     at org.apache.flink.runtime.security.HadoopSecurityContext$1.run(HadoopSecurityContext.java:43)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>     at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:40)
>     at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1116)
> Caused by: java.lang.RuntimeException: Unable to get ClusterClient status from Application Client
>     at org.apache.flink.yarn.YarnClusterClient.getClusterStatus(YarnClusterClient.java:248)
>     at org.apache.flink.yarn.YarnClusterClient.waitForClusterToBeReady(YarnClusterClient.java:520)
>     at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:412)
>     at org.apache.flink.yarn.YarnClusterClient.submitJob(YarnClusterClient.java:210)
>     at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:400)
>     at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:387)
>     at org.apache.flink.client.program.ContextEnvironment.
> execute(ContextEnvironment.java:62)
>     at org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:926)
>     at org.apache.flink.api.java.DataSet.collect(DataSet.java:410)
>     at org.apache.flink.api.java.DataSet.print(DataSet.java:1605)
>     at org.apache.flink.examples.java.wordcount.WordCount.main(WordCount.java:92)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:528)
>     ... 13 more
> Caused by: org.apache.flink.runtime.leaderretrieval.LeaderRetrievalException: Could not retrieve the leader gateway
>     at org.apache.flink.runtime.util.LeaderRetrievalUtils.retrieveLeaderGateway(LeaderRetrievalUtils.java:141)
>     at org.apache.flink.client.program.ClusterClient.getJobManagerGateway(ClusterClient.java:691)
>     at org.apache.flink.yarn.YarnClusterClient.getClusterStatus(YarnClusterClient.java:242)
>     ... 28 more
> Caused by: java.util.concurrent.TimeoutException: Futures timed out after [10000 milliseconds]
>     at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
>     at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
>     at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:190)
>     at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
>     at scala.concurrent.Await$.result(package.scala:190)
>     at scala.concurrent.Await.result(package.scala)
>     at org.apache.flink.runtime.util.LeaderRetrievalUtils.retrieveLeaderGateway(LeaderRetrievalUtils.java:139)
>     ... 30 more
> 2017-02-17 15:52:21,145 INFO org.apache.flink.yarn.
> YarnClusterClient - Sending shutdown request to the Application Master
> 2017-02-17 15:52:21,145 INFO org.apache.flink.yarn.YarnClusterClient - Start application client.
> 2017-02-17 15:52:21,151 WARN org.apache.flink.yarn.YarnClusterClient - YARN reported application state FAILED
> 2017-02-17 15:52:21,152 WARN org.apache.flink.yarn.YarnClusterClient - Diagnostics: Application application_1487247313588_0016 failed 1 times due to AM Container for appattempt_1487247313588_0016_000001 exited with exitCode: -103
> For more detailed output, check application tracking page: http://vip-rc-ucsww.vclound.com:8088/cluster/app/application_1487247313588_0016Then, click on links to logs of each attempt.
> Diagnostics: Container [pid=18590,containerID=container_1487247313588_0016_01_000001] is running beyond virtual memory limits. Current usage: 266.1 MB of 1 GB physical memory used; 2.2 GB of 2.1 GB virtual memory used. Killing container.
> Dump of the process-tree for container_1487247313588_0016_01_000001 :
> |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
> |- 18598 18590 18590 18590 (java) 894 48 2294116352 67782 /home/software/jdk1.8.0_111/bin/java -Xmx424M -Dlog.file=/home/software/hadoop-2.7.3/logs/userlogs/application_1487247313588_0016/container_1487247313588_0016_01_000001/jobmanager.log -Dlogback.configurationFile=file:logback.xml -Dlog4j.configuration=file:log4j.properties org.apache.flink.yarn.YarnApplicationMasterRunner
> |- 18590 18588 18590 18590 (bash) 0 0 108605440 334 /bin/bash -c /home/software/jdk1.8.0_111/bin/java -Xmx424M -Dlog.file=/home/software/hadoop-2.7.3/logs/userlogs/application_1487247313588_0016/container_1487247313588_0016_01_000001/jobmanager.log -Dlogback.configurationFile=file:logback.xml -Dlog4j.configuration=file:log4j.properties org.apache.flink.yarn.YarnApplicationMasterRunner 1>/home/software/hadoop-2.7.3/logs/userlogs/application_1487247313588_0016/container_1487247313588_0016_01_000001/jobmanager.out 2>/home/software/hadoop-2.7.3/logs/userlogs/application_1487247313588_0016/container_1487247313588_0016_01_000001/jobmanager.err
>
> Container killed on request. Exit code is 143
> Container exited with a non-zero exit code 143
> Failing this attempt. Failing the application.
> 2017-02-17 15:52:21,160 INFO org.apache.flink.yarn.ApplicationClient - Notification about new leader address akka.tcp://flink@vip-rc-vsubu.vclound.com:55926/user/jobmanager with session ID null.
> 2017-02-17 15:52:21,163 INFO org.apache.flink.yarn.ApplicationClient - Sending StopCluster request to JobManager.
> 2017-02-17 15:52:21,164 INFO org.apache.flink.yarn.ApplicationClient - Received address of new leader akka.tcp://flink@vip-rc-vsubu.vclound.com:55926/user/jobmanager with session ID null.
> 2017-02-17 15:52:21,165 INFO org.apache.flink.yarn.ApplicationClient - Disconnect from JobManager null.
> 2017-02-17 15:52:21,168 INFO org.apache.flink.yarn.ApplicationClient - Trying to register at JobManager akka.tcp://flink@vip-rc-vsubu.vclound.com:55926/user/jobmanager.
> 2017-02-17 15:52:21,684 INFO org.apache.flink.yarn.ApplicationClient - Trying to register at JobManager akka.tcp://flink@vip-rc-vsubu.vclound.com:55926/user/jobmanager.
> 2017-02-17 15:52:22,174 INFO org.apache.flink.yarn.ApplicationClient - Sending StopCluster request to JobManager.
> 2017-02-17 15:52:22,704 INFO org.apache.flink.yarn.ApplicationClient - Trying to register at JobManager akka.tcp://flink@vip-rc-vsubu.vclound.com:55926/user/jobmanager.
> 2017-02-17 15:52:23,194 INFO org.apache.flink.yarn.ApplicationClient - Sending StopCluster request to JobManager.
> 2017-02-17 15:52:24,214 INFO org.apache.flink.yarn.
> ApplicationClient - Sending StopCluster request to JobManager.
> 2017-02-17 15:52:24,725 INFO org.apache.flink.yarn.ApplicationClient - Trying to register at JobManager akka.tcp://flink@vip-rc-vsubu.vclound.com:55926/user/jobmanager.
> 2017-02-17 15:52:25,234 INFO org.apache.flink.yarn.ApplicationClient - Sending StopCluster request to JobManager.
> 2017-02-17 15:52:26,254 INFO org.apache.flink.yarn.ApplicationClient - Sending StopCluster request to JobManager.
> 2017-02-17 15:52:27,274 INFO org.apache.flink.yarn.ApplicationClient - Sending StopCluster request to JobManager.
> 2017-02-17 15:52:28,294 INFO org.apache.flink.yarn.ApplicationClient - Sending StopCluster request to JobManager.
> 2017-02-17 15:52:28,744 INFO org.apache.flink.yarn.ApplicationClient - Trying to register at JobManager akka.tcp://flink@vip-rc-vsubu.vclound.com:55926/user/jobmanager.
> 2017-02-17 15:52:29,314 INFO org.apache.flink.yarn.ApplicationClient - Sending StopCluster request to JobManager.
> 2017-02-17 15:52:30,334 INFO org.apache.flink.yarn.ApplicationClient - Sending StopCluster request to JobManager.
> 2017-02-17 15:52:31,155 WARN org.apache.flink.yarn.YarnClusterClient - Error while stopping YARN cluster.
> java.util.concurrent.TimeoutException: Futures timed out after [10000 milliseconds]
>     at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
>     at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:153)
>     at scala.concurrent.Await$$anonfun$ready$1.apply(package.scala:169)
>     at scala.concurrent.Await$$anonfun$ready$1.apply(package.
> scala:169)
>     at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
>     at scala.concurrent.Await$.ready(package.scala:169)
>     at scala.concurrent.Await.ready(package.scala)
>     at org.apache.flink.yarn.YarnClusterClient.shutdownCluster(YarnClusterClient.java:372)
>     at org.apache.flink.yarn.YarnClusterClient.finalizeCluster(YarnClusterClient.java:342)
>     at org.apache.flink.client.program.ClusterClient.shutdown(ClusterClient.java:208)
>     at org.apache.flink.client.CliFrontend.run(CliFrontend.java:263)
>     at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1073)
>     at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1120)
>     at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1117)
>     at org.apache.flink.runtime.security.HadoopSecurityContext$1.run(HadoopSecurityContext.java:43)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>     at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:40)
>     at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1116)
> 2017-02-17 15:52:31,156 INFO org.apache.flink.yarn.YarnClusterClient - Deleting files in hdfs://10.199.202.161:9000/user/root/.flink/application_1487247313588_0016
> 2017-02-17 15:52:31,354 INFO org.apache.flink.yarn.ApplicationClient - Sending StopCluster request to JobManager.
> 2017-02-17 15:52:32,163 INFO org.apache.flink.yarn.YarnClusterClient - Application application_1487247313588_0016 finished with state FAILED and final state FAILED at 1487317906227
> 2017-02-17 15:52:32,163 WARN org.apache.flink.yarn.YarnClusterClient - Application failed. Diagnostics Application application_1487247313588_0016 failed 1 times due to AM Container for appattempt_1487247313588_0016_000001 exited with exitCode: -103
> For more detailed output, check application tracking page: http://vip-rc-ucsww.vclound.com:8088/cluster/app/application_1487247313588_0016Then, click on links to logs of each attempt.
> Diagnostics: Container [pid=18590,containerID=container_1487247313588_0016_01_000001] is running beyond virtual memory limits. Current usage: 266.1 MB of 1 GB physical memory used; 2.2 GB of 2.1 GB virtual memory used. Killing container.
> Dump of the process-tree for container_1487247313588_0016_01_000001 :
> |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
> |- 18598 18590 18590 18590 (java) 894 48 2294116352 67782 /home/software/jdk1.8.0_111/bin/java -Xmx424M -Dlog.file=/home/software/hadoop-2.7.3/logs/userlogs/application_1487247313588_0016/container_1487247313588_0016_01_000001/jobmanager.log -Dlogback.configurationFile=file:logback.xml -Dlog4j.configuration=file:log4j.properties org.apache.flink.yarn.YarnApplicationMasterRunner
> |- 18590 18588 18590 18590 (bash) 0 0 108605440 334 /bin/bash -c /home/software/jdk1.8.0_111/bin/java -Xmx424M -Dlog.file=/home/software/hadoop-2.7.3/logs/userlogs/application_1487247313588_0016/container_1487247313588_0016_01_000001/jobmanager.log -Dlogback.configurationFile=file:logback.xml -Dlog4j.configuration=file:log4j.properties org.apache.flink.yarn.YarnApplicationMasterRunner 1>/home/software/hadoop-2.7.3/logs/userlogs/application_1487247313588_0016/container_1487247313588_0016_01_000001/jobmanager.out 2>/home/software/hadoop-2.7.3/logs/userlogs/application_1487247313588_0016/container_1487247313588_0016_01_000001/jobmanager.err
>
> Container killed on request. Exit code is 143
> Container exited with a non-zero exit code 143
> Failing this attempt. Failing the application.
> 2017-02-17 15:52:32,164 WARN org.apache.flink.yarn.YarnClusterClient - If log aggregation is activated in the Hadoop cluster, we recommend to retrieve the full application log using this command:
> yarn logs -applicationId application_1487247313588_0016
> (It sometimes takes a few seconds until the logs are aggregated)
> 2017-02-17 15:52:32,164 INFO org.apache.flink.yarn.YarnClusterClient - YARN Client is shutting down
> 2017-02-17 15:52:32,267 INFO org.apache.flink.yarn.ApplicationClient - Stopped Application client.
> 2017-02-17 15:52:32,267 INFO org.apache.flink.yarn.ApplicationClient - Disconnect from JobManager null.
>
> *From:* Bruno Aranda [mailto:brunoaranda@gmail.com]
> *Sent:* 17 February 2017 17:02
> *To:* user@flink.apache.org
> *Subject:* Re: Can't run flink on yarn on version 1.2.0
>
> Hi Howard,
>
> We run Flink 1.2 on YARN without issues. Sorry, I don't have a specific
> solution, but are you sure you don't have some sort of mix of Flink
> versions? In your logs I can see:
>
> *The configuration directory ('/home/software/flink-1.1.4/conf') contains
> both LOG4J and Logback configuration files. Please delete or rename one of
> them.*
>
> It mentions 1.1.4 in the path of the conf directory instead of 1.2.
>
> Cheers,
>
> Bruno
>
> On Fri, 17 Feb 2017 at 08:50 Howard,Li(vip.com) wrote:
>
> Hi,
>
> I'm trying to run Flink on YARN using the command: bin/flink run -m yarn-cluster -yn 2 -ys 4 ./examples/batch/WordCount.jar
>
> But I got the following error:
>
> 2017-02-17 15:52:40,746 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - No path for the flink jar passed.
> Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2017-02-17 15:52:40,746 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2017-02-17 15:52:40,775 INFO org.apache.flink.yarn.YarnClusterDescriptor - Using values:
> 2017-02-17 15:52:40,775 INFO org.apache.flink.yarn.YarnClusterDescriptor - TaskManager count = 2
> 2017-02-17 15:52:40,775 INFO org.apache.flink.yarn.YarnClusterDescriptor - JobManager memory = 1024
> 2017-02-17 15:52:40,775 INFO org.apache.flink.yarn.YarnClusterDescriptor - TaskManager memory = 1024
> 2017-02-17 15:52:40,796 INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at /0.0.0.0:8032
> 2017-02-17 15:52:41,680 WARN org.apache.flink.yarn.YarnClusterDescriptor - The configuration directory ('/home/software/flink-1.1.4/conf') contains both LOG4J and Logback configuration files. Please delete or rename one of them.
> 2017-02-17 15:52:41,702 INFO org.apache.flink.yarn.Utils - Copying from file:/home/software/flink-1.1.4/conf/logback.xml to hdfs://10.199.202.161:9000/user/root/.flink/application_1487247313588_0017/logback.xml
> 2017-02-17 15:52:42,025 INFO org.apache.flink.yarn.Utils - Copying from file:/home/software/flink-1.1.4/lib to hdfs://10.199.202.161:9000/user/root/.flink/application_1487247313588_0017/lib
> 2017-02-17 15:52:42,695 INFO org.apache.flink.yarn.Utils - Copying from file:/home/software/flink-1.1.4/conf/log4j.properties to hdfs://10.199.202.161:9000/user/root/.flink/application_1487247313588_0017/log4j.properties
> 2017-02-17 15:52:42,722 INFO org.apache.flink.yarn.Utils - Copying from file:/home/software/flink-1.1.4/lib/flink-dist_2.10-1.1.4.jar to hdfs://10.199.202.161:9000/user/root/.flink/application_1487247313588_0017/flink-dist_2.10-1.1.4.jar
> 2017-02-17 15:52:43,346 INFO org.apache.flink.yarn.Utils - Copying from /home/software/flink-1.1.4/conf/flink-conf.yaml to hdfs://10.199.202.161:9000/user/root/.flink/application_1487247313588_0017/flink-conf.yaml
> 2017-02-17 15:52:43,386 INFO org.apache.flink.yarn.YarnClusterDescriptor - Submitting application master application_1487247313588_0017
> 2017-02-17 15:52:43,425 INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1487247313588_0017
> 2017-02-17 15:52:43,425 INFO org.apache.flink.yarn.YarnClusterDescriptor - Waiting for the cluster to be allocated
> 2017-02-17 15:52:43,427 INFO org.apache.flink.yarn.YarnClusterDescriptor - Deploying cluster, current state ACCEPTED
> 2017-02-17 15:52:48,471 INFO org.apache.flink.yarn.YarnClusterDescriptor - YARN application has been deployed successfully.
> Cluster started: Yarn cluster with application id application_1487247313588_0017
> Using address 10.199.202.162:43809 to connect to JobManager.
> JobManager web interface address http://vip-rc-ucsww.vclound.com:8088/proxy/application_1487247313588_0017/
> Using the parallelism provided by the remote cluster (8). To use another parallelism, set it at the ./bin/flink client.
> Starting execution of program
> 2017-02-17 15:52:49,278 INFO org.apache.flink.yarn.YarnClusterClient - Starting program in interactive mode
> Executing WordCount example with default input data set.
> Use --input to specify file input.
> Printing result to stdout. Use --output to specify output path.
> 2017-02-17 15:52:49,609 INFO org.apache.flink.yarn.
> YarnClusterClient - Waiting until all TaskManagers have connected
> Waiting until all TaskManagers have connected
> 2017-02-17 15:52:49,610 INFO org.apache.flink.yarn.YarnClusterClient - Starting client actor system.
>
> ------------------------------------------------------------
> The program finished with the following exception:
>
> org.apache.flink.client.program.ProgramInvocationException: The main method caused an error.
>     at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:525)
>     at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:404)
>     at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:321)
>     at org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:777)
>     at org.apache.flink.client.CliFrontend.run(CliFrontend.java:253)
>     at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1005)
>     at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1048)
> Caused by: java.lang.RuntimeException: Unable to get ClusterClient status from Application Client
>     at org.apache.flink.yarn.YarnClusterClient.getClusterStatus(YarnClusterClient.java:242)
>     at org.apache.flink.yarn.YarnClusterClient.waitForClusterToBeReady(YarnClusterClient.java:514)
>     at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:395)
>     at org.apache.flink.yarn.YarnClusterClient.submitJob(YarnClusterClient.java:204)
>     at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:383)
>     at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:370)
>     at org.apache.flink.client.program.ContextEnvironment.
> execute(ContextEnvironment.java:62)
>     at org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:896)
>     at org.apache.flink.api.java.DataSet.collect(DataSet.java:410)
>     at org.apache.flink.api.java.DataSet.print(DataSet.java:1605)
>     at org.apache.flink.examples.java.wordcount.WordCount.main(WordCount.java:92)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:510)
>     ... 6 more
> Caused by: org.apache.flink.runtime.leaderretrieval.LeaderRetrievalException: Could not retrieve the leader gateway
>     at org.apache.flink.runtime.util.LeaderRetrievalUtils.retrieveLeaderGateway(LeaderRetrievalUtils.java:127)
>     at org.apache.flink.client.program.ClusterClient.getJobManagerGateway(ClusterClient.java:645)
>     at org.apache.flink.yarn.YarnClusterClient.getClusterStatus(YarnClusterClient.java:237)
>     ... 21 more
> Caused by: java.util.concurrent.TimeoutException: Futures timed out after [10000 milliseconds]
>     at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
>     at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
>     at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
>     at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
>     at scala.concurrent.Await$.result(package.scala:107)
>     at scala.concurrent.Await.result(package.scala)
>     at org.apache.flink.runtime.util.LeaderRetrievalUtils.retrieveLeaderGateway(LeaderRetrievalUtils.java:125)
>     ... 23 more
> 2017-02-17 15:53:20,084 INFO org.apache.flink.yarn.
> YarnClusterClient - Sending shutdown request to the Application Master
> 2017-02-17 15:53:20,085 INFO org.apache.flink.yarn.YarnClusterClient - Start application client.
> 2017-02-17 15:53:20,088 WARN org.apache.flink.yarn.YarnClusterClient - YARN reported application state FAILED
> 2017-02-17 15:53:20,089 WARN org.apache.flink.yarn.YarnClusterClient - Diagnostics: Application application_1487247313588_0017 failed 1 times due to AM Container for appattempt_1487247313588_0017_000001 exited with exitCode: -103
> For more detailed output, check application tracking page: http://vip-rc-ucsww.vclound.com:8088/cluster/app/application_1487247313588_0017Then, click on links to logs of each attempt.
> Diagnostics: Container [pid=18733,containerID=container_1487247313588_0017_01_000001] is running beyond virtual memory limits. Current usage: 264.7 MB of 1 GB physical memory used; 2.2 GB of 2.1 GB virtual memory used. Killing container.
> Dump of the process-tree for container_1487247313588_0017_01_000001 :
> |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
> |- 18740 18733 18733 18733 (java) 955 64 2298933248 67430 /home/software/jdk1.8.0_111/bin/java -Xmx424M -Dlog.file=/home/software/hadoop-2.7.3/logs/userlogs/application_1487247313588_0017/container_1487247313588_0017_01_000001/jobmanager.log -Dlogback.configurationFile=file:logback.xml -Dlog4j.configuration=file:log4j.properties org.apache.flink.yarn.YarnApplicationMasterRunner
> |- 18733 18731 18733 18733 (bash) 0 0 108605440 334 /bin/bash -c /home/software/jdk1.8.0_111/bin/java -Xmx424M -Dlog.file=/home/software/hadoop-2.7.3/logs/userlogs/application_1487247313588_0017/container_1487247313588_0017_01_000001/jobmanager.log -Dlogback.configurationFile=file:logback.xml -Dlog4j.configuration=file:log4j.properties org.apache.flink.yarn.YarnApplicationMasterRunner 1>/home/software/hadoop-2.7.3/logs/userlogs/application_1487247313588_0017/container_1487247313588_0017_01_000001/jobmanager.out 2>/home/software/hadoop-2.7.3/logs/userlogs/application_1487247313588_0017/container_1487247313588_0017_01_000001/jobmanager.err
>
> Container killed on request. Exit code is 143
> Container exited with a non-zero exit code 143
> Failing this attempt. Failing the application.
> 2017-02-17 15:53:20,102 INFO org.apache.flink.yarn.ApplicationClient - Notification about new leader address akka.tcp://flink@10.199.202.162:43809/user/jobmanager with session ID null.
> 2017-02-17 15:53:20,106 INFO org.apache.flink.yarn.ApplicationClient - Sending StopCluster request to JobManager.
> 2017-02-17 15:53:20,107 INFO org.apache.flink.yarn.ApplicationClient - Received address of new leader akka.tcp://flink@10.199.202.162:43809/user/jobmanager with session ID null.
> 2017-02-17 15:53:20,108 INFO org.apache.flink.yarn.ApplicationClient - Disconnect from JobManager null.
> 2017-02-17 15:53:20,112 INFO org.apache.flink.yarn.ApplicationClient - Trying to register at JobManager akka.tcp://flink@10.199.202.162:43809/user/jobmanager.
> Listening for transport dt_socket at address: 5006
> 2017-02-17 15:53:20,624 INFO org.apache.flink.yarn.ApplicationClient - Trying to register at JobManager akka.tcp://flink@10.199.202.162:43809/user/jobmanager.
> 2017-02-17 15:53:21,124 INFO org.apache.flink.yarn.ApplicationClient - Sending StopCluster request to JobManager.
> 2017-02-17 15:53:21,645 INFO org.apache.flink.yarn.ApplicationClient - Trying to register at JobManager akka.tcp://flink@10.199.202.162:43809/user/jobmanager.
> 2017-02-17 15:53:22,145 INFO org.apache.flink.yarn.ApplicationClient - Sending StopCluster request to JobManager.
> 2017-02-17 15:53:23,165 INFO org.apache.flink.yarn.
> ApplicationClient - Sending StopCluster request to > JobManager. > > 2017-02-17 15:53:23,664 INFO org.apache.flink.yarn. > ApplicationClient - Trying to register at > JobManager akka.tcp://flink@10.199.202.162:43809/user/jobmanager. > > 2017-02-17 15:53:24,185 INFO org.apache.flink.yarn. > ApplicationClient - Sending StopCluster request to > JobManager. > > 2017-02-17 15:53:25,204 INFO org.apache.flink.yarn. > ApplicationClient - Sending StopCluster request to > JobManager. > > > > The main error is : org.apache.flink.runtime.leaderretrieval.LeaderRetrie= valException: > Could not retrieve the leader gateway=E3=80=82May be It have some relatio= nship > with https://issues.apache.org/jira/browse/FLINK-2821. It is said that IP > will always take place in akka address, but not hostnames. But I find > hostname in akka address in leaderRetrievalService. > > > > This problem won=E2=80=99t appear in 1.1.4. > > > > Thank you all. > > > > Howard > > =E6=9C=AC=E7=94=B5=E5=AD=90=E9=82=AE=E4=BB=B6=E5=8F=AF=E8=83=BD=E4=B8=BA= =E4=BF=9D=E5=AF=86=E6=96=87=E4=BB=B6=E3=80=82=E5=A6=82=E6=9E=9C=E9=98=81=E4= =B8=8B=E9=9D=9E=E7=94=B5=E5=AD=90=E9=82=AE=E4=BB=B6=E6=89=80=E6=8C=87=E5=AE= =9A=E4=B9=8B=E6=94=B6=E4=BB=B6=E4=BA=BA=EF=BC=8C=E8=B0=A8=E8=AF=B7=E7=AB=8B= =E5=8D=B3=E9=80=9A=E7=9F=A5=E6=9C=AC=E4=BA=BA=E3=80=82=E6=95=AC=E8=AF=B7=E9= =98=81=E4=B8=8B=E4=B8=8D=E8=A6=81=E4=BD=BF=E7=94=A8=E3=80=81=E4=BF=9D=E5=AD= =98=E3=80=81=E5=A4=8D=E5=8D=B0=E3=80=81=E6=89=93=E5=8D=B0=E3=80=81=E6=95=A3= =E5=B8=83=E6=9C=AC=E7=94=B5=E5=AD=90=E9=82=AE=E4=BB=B6=E5=8F=8A=E5=85=B6=E5= =86=85=E5=AE=B9=EF=BC=8C > =E6=88=96=E5=B0=86=E5=85=B6=E7=94=A8=E4=BA=8E=E5=85=B6=E4=BB=96=E4=BB=BB= =E4=BD=95=E7=9B=AE=E7=9A=84=E6=88=96=E5=90=91=E4=BB=BB=E4=BD=95=E4=BA=BA=E6= =8A=AB=E9=9C=B2=E3=80=82=E8=B0=A2=E8=B0=A2=E6=82=A8=E7=9A=84=E5=90=88=E4=BD= =9C=EF=BC=81 This communication is intended only for the > addressee(s) and may contain information that is privileged and > confidential. 
You are hereby notified that, if you are not an intended > recipient listed above, or an authorized employee or agent of an addresse= e > of this communication responsible for delivering e-mail messages to an > intended recipient, any dissemination, distribution or reproduction of th= is > communication (including any attachments hereto) is strictly prohibited. = If > you have received this communication in error, please notify us immediate= ly > by a reply e-mail addressed to the sender and permanently delete the > original e-mail communication and any attachments from all storage device= s > without making or otherwise retaining a copy. > > =E6=9C=AC=E7=94=B5=E5=AD=90=E9=82=AE=E4=BB=B6=E5=8F=AF=E8=83=BD=E4=B8=BA= =E4=BF=9D=E5=AF=86=E6=96=87=E4=BB=B6=E3=80=82=E5=A6=82=E6=9E=9C=E9=98=81=E4= =B8=8B=E9=9D=9E=E7=94=B5=E5=AD=90=E9=82=AE=E4=BB=B6=E6=89=80=E6=8C=87=E5=AE= =9A=E4=B9=8B=E6=94=B6=E4=BB=B6=E4=BA=BA=EF=BC=8C=E8=B0=A8=E8=AF=B7=E7=AB=8B= =E5=8D=B3=E9=80=9A=E7=9F=A5=E6=9C=AC=E4=BA=BA=E3=80=82=E6=95=AC=E8=AF=B7=E9= =98=81=E4=B8=8B=E4=B8=8D=E8=A6=81=E4=BD=BF=E7=94=A8=E3=80=81=E4=BF=9D=E5=AD= =98=E3=80=81=E5=A4=8D=E5=8D=B0=E3=80=81=E6=89=93=E5=8D=B0=E3=80=81=E6=95=A3= =E5=B8=83=E6=9C=AC=E7=94=B5=E5=AD=90=E9=82=AE=E4=BB=B6=E5=8F=8A=E5=85=B6=E5= =86=85=E5=AE=B9=EF=BC=8C=E6=88=96=E5=B0=86=E5=85=B6=E7=94=A8=E4=BA=8E=E5=85= =B6=E4=BB=96=E4=BB=BB=E4=BD=95=E7=9B=AE=E7=9A=84=E6=88=96=E5=90=91=E4=BB=BB= =E4=BD=95=E4=BA=BA=E6=8A=AB=E9=9C=B2=E3=80=82=E8=B0=A2=E8=B0=A2=E6=82=A8=E7= =9A=84=E5=90=88=E4=BD=9C=EF=BC=81 > This communication is intended only for the addressee(s) and may contain > information that is privileged and confidential. 
You are hereby notified > that, if you are not an intended recipient listed above, or an authorized > employee or agent of an addressee of this communication responsible for > delivering e-mail messages to an intended recipient, any dissemination, > distribution or reproduction of this communication (including any > attachments hereto) is strictly prohibited. If you have received this > communication in error, please notify us immediately by a reply e-mail > addressed to the sender and permanently delete the original e-mail > communication and any attachments from all storage devices without making > or otherwise retaining a copy. > --001a11470abc8a74c20548b7711a Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable
Hi Howard,

could you check whether the JobManager's actor system was bound to "vip-rc-vsubu.vclound.com:55926"? You should see that in the JobManager logs. Furthermore, have you checked that your YARN cluster nodes are actually reachable from the node where you start the Flink application? If so, the logs of the CLI client as well as the JobManager logs (both on debug level) would be tremendously helpful.
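For the reachability check, here is a minimal sketch in Python, assuming Python 3 is available on the node where the Flink client is started. `can_connect` is a hypothetical helper name, not part of Flink or YARN, and the host and port in the usage comment are the ones the client printed. It only tests whether a TCP connection can be opened; it cannot tell which address the actor system was bound to, so the JobManager logs are still needed for that.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection resolves the hostname and attempts the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts.
        return False


# Usage on the client node (address taken from the client output):
#   can_connect("vip-rc-vsubu.vclound.com", 55926)
```

A result of False while the application is still running would point at name resolution or a firewall between the client node and the YARN nodes.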
Cheers,
Till
On Fri, Feb 17, 2017 at 10:41 AM, Howard,Li(vip.com) <howard.li@vipshop.com> wrote:

Sorry for the confusion I caused. I did copy the wrong log, but we do hit this problem on 1.2.0.

For version 1.1.4, however, we hit this in one cluster but not in another. We are still trying to figure out what happened.

The following is the log for version 1.2.0:

2017-02-17 15:51:37,775 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar

2017-02-17 15:51:37,775 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar

2017-02-17 15:51:37,803 INFO  org.apache.flink.yarn.YarnClusterDescriptor - Using values:

2017-02-17 15:51:37,804 INFO  org.apache.flink.yarn.YarnClusterDescriptor -    TaskManager count = 2

2017-02-17 15:51:37,804 INFO  org.apache.flink.yarn.YarnClusterDescriptor -    JobManager memory = 1024

2017-02-17 15:51:37,804 INFO  org.apache.flink.yarn.YarnClusterDescriptor -    TaskManager memory = 1024

2017-02-17 15:51:37,827 INFO  org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at /0.0.0.0:8032

2017-02-17 15:51:38,672 WARN  org.apache.flink.yarn.YarnClusterDescriptor - The configuration directory ('/home/software/flink-1.2.0/conf') contains both LOG4J and Logback configuration files. Please delete or rename one of them.

2017-02-17 15:51:38,685 INFO  org.apache.flink.yarn.Utils - Copying from file:/home/software/flink-1.2.0/examples/batch/WordCount.jar to hdfs://10.199.202.161:9000/user/root/.flink/application_1487247313588_0016/WordCount.jar

2017-02-17 15:51:38,992 INFO  org.apache.flink.yarn.Utils - Copying from file:/home/software/flink-1.2.0/conf/log4j.properties to hdfs://10.199.202.161:9000/user/root/.flink/application_1487247313588_0016/log4j.properties

2017-02-17 15:51:39,058 INFO  org.apache.flink.yarn.Utils - Copying from file:/home/software/flink-1.2.0/conf/logback.xml to hdfs://10.199.202.161:9000/user/root/.flink/application_1487247313588_0016/logback.xml

2017-02-17 15:51:39,085 INFO  org.apache.flink.yarn.Utils - Copying from file:/home/software/flink-1.2.0/lib to hdfs://10.199.202.161:9000/user/root/.flink/application_1487247313588_0016/lib

2017-02-17 15:51:39,695 INFO  org.apache.flink.yarn.Utils - Copying from file:/home/software/flink-1.2.0/lib/flink-dist_2.11-1.2.0.jar to hdfs://10.199.202.161:9000/user/root/.flink/application_1487247313588_0016/flink-dist_2.11-1.2.0.jar

2017-02-17 15:51:40,493 INFO  org.apache.flink.yarn.Utils - Copying from /home/software/flink-1.2.0/conf/flink-conf.yaml to hdfs://10.199.202.161:9000/user/root/.flink/application_1487247313588_0016/flink-conf.yaml

2017-02-17 15:51:40,547 INFO  org.apache.flink.yarn.YarnClusterDescriptor - Submitting application master application_1487247313588_0016

2017-02-17 15:51:40,585 INFO  org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1487247313588_0016

2017-02-17 15:51:40,585 INFO  org.apache.flink.yarn.YarnClusterDescriptor - Waiting for the cluster to be allocated

2017-02-17 15:51:40,587 INFO  org.apache.flink.yarn.YarnClusterDescriptor - Deploying cluster, current state ACCEPTED

2017-02-17 15:51:45,879 INFO  org.apache.flink.yarn.YarnClusterDescriptor - YARN application has been deployed successfully.

Cluster started: Yarn cluster with application id application_1487247313588_0016

Using address vip-rc-vsubu.vclound.com:55926 to connect to JobManager.

JobManager web interface address http://vip-rc-ucsww.vclound.com:8088/proxy/application_1487247313588_0016/

Using the parallelism provided by the remote cluster (8). To use another parallelism, set it at the ./bin/flink client.

Starting execution of program

2017-02-17 15:51:46,704 INFO  org.apache.flink.yarn.YarnClusterClient - Starting program in interactive mode

Executing WordCount example with default input data set.

Use --input to specify file input.

Printing result to stdout. Use --output to specify output path.

2017-02-17 15:51:47,029 INFO  org.apache.flink.yarn.YarnClusterClient - Waiting until all TaskManagers have connected

Waiting until all TaskManagers have connected

2017-02-17 15:51:47,029 INFO  org.apache.flink.yarn.YarnClusterClient - Starting client actor system.

------------------------------------------------------------

The program finished with the following exception:

org.apache.flink.client.program.ProgramInvocationException: The main method caused an error.

        at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:545)

        at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:419)

        at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:339)

        at org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:831)

        at org.apache.flink.client.CliFrontend.run(CliFrontend.java:256)

        at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1073)

        at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1120)

        at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1117)

        at org.apache.flink.runtime.security.HadoopSecurityContext$1.run(HadoopSecurityContext.java:43)

        at java.security.AccessController.doPrivileged(Native Method)

        at javax.security.auth.Subject.doAs(Subject.java:422)

        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)

        at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:40)

        at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1116)

Caused by: java.lang.RuntimeException: Unable to get ClusterClient status from Application Client

        at org.apache.flink.yarn.YarnClusterClient.getClusterStatus(YarnClusterClient.java:248)

        at org.apache.flink.yarn.YarnClusterClient.waitForClusterToBeReady(YarnClusterClient.java:520)

        at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:412)

        at org.apache.flink.yarn.YarnClusterClient.submitJob(YarnClusterClient.java:210)

        at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:400)

        at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:387)

        at org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:62)

        at org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:926)

        at org.apache.flink.api.java.DataSet.collect(DataSet.java:410)

        at org.apache.flink.api.java.DataSet.print(DataSet.java:1605)

        at org.apache.flink.examples.java.wordcount.WordCount.main(WordCount.java:92)

        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)

        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

        at java.lang.reflect.Method.invoke(Method.java:498)

        at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:528)

        ... 13 more

Caused by: org.apache.flink.runtime.leaderretrieval.LeaderRetrievalException: Could not retrieve the leader gateway

        at org.apache.flink.runtime.util.LeaderRetrievalUtils.retrieveLeaderGateway(LeaderRetrievalUtils.java:141)

        at org.apache.flink.client.program.ClusterClient.getJobManagerGateway(ClusterClient.java:691)

        at org.apache.flink.yarn.YarnClusterClient.getClusterStatus(YarnClusterClient.java:242)

        ... 28 more

Caused by: java.util.concurrent.TimeoutException: Futures timed out after [10000 milliseconds]

        at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)

        at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)

        at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:190)

        at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)

        at scala.concurrent.Await$.result(package.scala:190)

        at scala.concurrent.Await.result(package.scala)

        at org.apache.flink.runtime.util.LeaderRetrievalUtils.retrieveLeaderGateway(LeaderRetrievalUtils.java:139)

        ... 30 more

2017-02-17 15:52:21,145 INFO  org.apache.flink.yarn.YarnClusterClient - Sending shutdown request to the Application Master

2017-02-17 15:52:21,145 INFO  org.apache.flink.yarn.YarnClusterClient - Start application client.

2017-02-17 15:52:21,151 WARN  org.apache.flink.yarn.YarnClusterClient - YARN reported application state FAILED

2017-02-17 15:52:21,152 WARN  org.apache.flink.yarn.YarnClusterClient - Diagnostics: Application application_1487247313588_0016 failed 1 times due to AM Container for appattempt_1487247313588_0016_000001 exited with  exitCode: -103

For more detailed output, check application tracking page:http://vip-rc-ucsww.vclound.com:8088/cluster/app/application_1487247313588_0016Then, click on links to logs of each attempt.

Diagnostics: Container [pid=18590,containerID=container_1487247313588_0016_01_000001] is running beyond virtual memory limits. Current usage: 266.1 MB of 1 GB physical memory used; 2.2 GB of 2.1 GB virtual memory used. Killing container.

Dump of the process-tree for container_1487247313588_0016_01_000001 :

        |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE

        |- 18598 18590 18590 18590 (java) 894 48 2294116352 67782 /home/software/jdk1.8.0_111/bin/java -Xmx424M -Dlog.file=/home/software/hadoop-2.7.3/logs/userlogs/application_1487247313588_0016/container_1487247313588_0016_01_000001/jobmanager.log -Dlogback.configurationFile=file:logback.xml -Dlog4j.configuration=file:log4j.properties org.apache.flink.yarn.YarnApplicationMasterRunner

        |- 18590 18588 18590 18590 (bash) 0 0 108605440 334 /bin/bash -c /home/software/jdk1.8.0_111/bin/java -Xmx424M  -Dlog.file=/home/software/hadoop-2.7.3/logs/userlogs/application_1487247313588_0016/container_1487247313588_0016_01_000001/jobmanager.log -Dlogback.configurationFile=file:logback.xml -Dlog4j.configuration=file:log4j.properties org.apache.flink.yarn.YarnApplicationMasterRunner  1>/home/software/hadoop-2.7.3/logs/userlogs/application_1487247313588_0016/container_1487247313588_0016_01_000001/jobmanager.out 2>/home/software/hadoop-2.7.3/logs/userlogs/application_1487247313588_0016/container_1487247313588_0016_01_000001/jobmanager.err

Container killed on request. Exit code is 143

Container exited with a non-zero exit code 143

Failing this attempt. Failing the application.
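The virtual-memory figures in the diagnostics above can be checked with a little arithmetic. The sketch below is an illustration, not output from YARN. It assumes the NodeManager runs with the default `yarn.nodemanager.vmem-pmem-ratio` of 2.1, which matches the 2.1 GB limit reported for the 1 GB container, and it takes the byte counts from the VMEM_USAGE(BYTES) column of the process-tree dump.

```python
GIB = 1024 ** 3


def vmem_limit_bytes(container_mb: int, vmem_pmem_ratio: float = 2.1) -> float:
    """Virtual memory limit for a container: physical size times the vmem/pmem ratio.

    The 2.1 default is an assumption about this cluster's configuration.
    """
    return container_mb * 1024 * 1024 * vmem_pmem_ratio


# VMEM_USAGE(BYTES) values copied from the process-tree dump.
process_vmem = {
    "java (pid 18598)": 2294116352,
    "bash (pid 18590)": 108605440,
}

used = sum(process_vmem.values())
limit = vmem_limit_bytes(1024)  # "JobManager memory = 1024" in the client log

print(f"used  = {used / GIB:.2f} GiB")
print(f"limit = {limit / GIB:.2f} GiB")
print("over limit:", used > limit)
```

Under that assumption the two processes together use about 2.24 GiB of virtual memory against a limit of 2.10 GiB, which is consistent with the "2.2 GB of 2.1 GB virtual memory used" message and the container being killed.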

2017-02-17 15:52:21,160 INFO  org.apache.flink.yarn.ApplicationClient - Notification about new leader address akka.tcp://flink@vip-rc-vsubu.vclound.com:55926/user/jobmanager with session ID null.

2017-02-17 15:52:21,163 INFO  org.apache.flink.yarn.ApplicationClient - Sending StopCluster request to JobManager.

2017-02-17 15:52:21,164 INFO  org.apache.flink.yarn.ApplicationClient - Received address of new leader akka.tcp://flink@vip-rc-vsubu.vclound.com:55926/user/jobmanager with session ID null.

2017-02-17 15:52:21,165 INFO  org.apache.flink.yarn.ApplicationClient - Disconnect from JobManager null.

2017-02-17 15:52:21,168 INFO  org.apache.flink.yarn.ApplicationClient - Trying to register at JobManager akka.tcp://flink@vip-rc-vsubu.vclound.com:55926/user/jobmanager.

2017-02-17 15:52:21,684 INFO  org.apache.flink.yarn.ApplicationClient - Trying to register at JobManager akka.tcp://flink@vip-rc-vsubu.vclound.com:55926/user/jobmanager.

2017-02-17 15:52:22,174 INFO  org.apache.flink.yarn.ApplicationClient - Sending StopCluster request to JobManager.

2017-02-17 15:52:22,704 INFO  org.apache.flink.yarn.ApplicationClient - Trying to register at JobManager akka.tcp://flink@vip-rc-vsubu.vclound.com:55926/user/jobmanager.

2017-02-17 15:52:23,194 INFO  org.apache.flink.yarn.ApplicationClient - Sending StopCluster request to JobManager.

2017-02-17 15:52:24,214 INFO  org.apache.flink.yarn.ApplicationClient - Sending StopCluster request to JobManager.

2017-02-17 15:52:24,725 INFO  org.apache.flink.yarn.ApplicationClient - Trying to register at JobManager akka.tcp://flink@vip-rc-vsubu.vclound.com:55926/user/jobmanager.

2017-02-17 15:52:25,234 INFO  org.apache.flink.yarn.ApplicationClient - Sending StopCluster request to JobManager.

2017-02-17 15:52:26,254 INFO  org.apache.flink.yarn.ApplicationClient - Sending StopCluster request to JobManager.

2017-02-17 15:52:27,274 INFO  org.apache.flink.yarn.ApplicationClient - Sending StopCluster request to JobManager.

2017-02-17 15:52:28,294 INFO  org.apache.flink.yarn.ApplicationClient - Sending StopCluster request to JobManager.

2017-02-17 15:52:28,744 INFO  org.apache.flink.yarn.ApplicationClient - Trying to register at JobManager akka.tcp://flink@vip-rc-vsubu.vclound.com:55926/user/jobmanager.

2017-02-17 15:52:29,314 INFO  org.apache.flink.yarn.ApplicationClient - Sending StopCluster request to JobManager.

2017-02-17 15:52:30,334 INFO  org.apache.flink.yarn.ApplicationClient - Sending StopCluster request to JobManager.

2017-02-17 15:52:31,155 WARN  org.apache.flink.yarn.YarnClusterClient - Error while stopping YARN cluster.

java.util.concurrent.TimeoutException: Futures timed out after [10000 milliseconds]

        at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)

        at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:153)

        at scala.concurrent.Await$$anonfun$ready$1.apply(package.scala:169)

        at scala.concurrent.Await$$anonfun$ready$1.apply(package.scala:169)

        at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)

        at scala.concurrent.Await$.ready(package.scala:169)

        at scala.concurrent.Await.ready(package.scala)

        at org.apache.flink.yarn.YarnClusterClient.shutdownCluster(YarnClusterClient.java:372)

        at org.apache.flink.yarn.YarnClusterClient.finalizeCluster(YarnClusterClient.java:342)

        at org.apache.flink.client.program.ClusterClient.shutdown(ClusterClient.java:208)

        at org.apache.flink.client.CliFrontend.run(CliFrontend.java:263)

        at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1073)

        at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1120)

        at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1117)

        at org.apache.flink.runtime.security.HadoopSecurityContext$1.run(HadoopSecurityContext.java:43)

        at java.security.AccessController.doPrivileged(Native Method)

        at javax.security.auth.Subject.doAs(Subject.java:422)

        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)

        at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:40)

        at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1116)

2017-02-17 15:52:31,156 INFO  org.apache.flink.yarn.YarnClusterClient - Deleting files in hdfs://10.199.202.161:9000/user/root/.flink/application_1487247313588_0016

2017-02-17 15:52:31,354 INFO  org.apache.flink.yarn.ApplicationClient - Sending StopCluster request to JobManager.

2017-02-17 15:52:32,163 INFO  org.apache.flink.yarn.YarnClusterClient - Application application_1487247313588_0016 finished with state FAILED and final state FAILED at 1487317906227

2017-02-17 15:52:32,163 WARN  org.apache.flink.yarn.YarnClusterClient - Application failed. Diagnostics Application application_1487247313588_0016 failed 1 times due to AM Container for appattempt_1487247313588_0016_000001 exited with  exitCode: -103

For more detailed output, check application tracking page:http://vip-rc-ucsww.vclound.com:8088/cluster/app/application_1487247313588_0016Then, click on links to logs of each attempt.

Diagnostics: Container [pid=18590,containerID=container_1487247313588_0016_01_000001] is running beyond virtual memory limits. Current usage: 266.1 MB of 1 GB physical memory used; 2.2 GB of 2.1 GB virtual memory used. Killing container.

Dump of the process-tree for container_1487247313588_0016_01_000001 :

        |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE

        |- 18598 18590 18590 18590 (java) 894 48 2294116352 67782 /home/software/jdk1.8.0_111/bin/java -Xmx424M -Dlog.file=/home/software/hadoop-2.7.3/logs/userlogs/application_1487247313588_0016/container_1487247313588_0016_01_000001/jobmanager.log -Dlogback.configurationFile=file:logback.xml -Dlog4j.configuration=file:log4j.properties org.apache.flink.yarn.YarnApplicationMasterRunner

=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |- 18590 18588 18590 18590 (bash) 0= 0 108605440 334 /bin/bash -c /home/software/jdk1.8.0_111/bin/java -Xm= x424M=C2=A0 -Dlog.file=3D/home/software/hadoop-2.7.3/logs/userlogs/application_1487247313588_0016/container_1487247313588_0016_01_= 000001/jobmanager.log -Dlogback.configurationFile=3Dfile:logback.xml -Dlog4j.configuration= =3Dfile:log4j.properties org.apache.flink.yarn.YarnApplicationMas= terRunner=C2=A0 1>/home/software/hadoop-2.7.3/logs/userlogs/applica= tion_1487247313588_0016/container_1487247313588_0016_01_000001/jobmanager.out 2>/home/software/hadoop-2.7.3/logs/userlogs/application_14872= 47313588_0016/container_1487247313588_0016_01_000001/jobmanager.e= rr

=C2= =A0

Container = killed on request. Exit code is 143

Container = exited with a non-zero exit code 143

Failing th= is attempt. Failing the application.

201= 7-02-17 15:52:32,164 WARN=C2=A0 org.apache.flink.yarn.YarnClusterClien= t=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 - If log = aggregation is activated in the Hadoop cluster, we recommend to retrieve the full application log using this command:

=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 yarn logs -applicationId applicatio= n_1487247313588_0016

(It someti= mes takes a few seconds until the logs are aggregated)=

2017-02-17= 15:52:32,164 INFO=C2=A0 org.apache.flink.yarn.YarnClusterClient=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 - YARN Client is s= hutting down

2017-02-17= 15:52:32,267 INFO=C2=A0 org.apache.flink.yarn.ApplicationClient=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 - Stopped Applicat= ion client.

2017-02-17= 15:52:32,267 INFO=C2=A0 org.apache.flink.yarn.ApplicationClient=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 - Disconnect from = JobManager null.


From: Bruno Aranda [mailto:brunoaranda@gmail.com]
Sent: February 17, 2017 17:02
To: user@flink.apache.org
Subject: Re: Can't run flink on yarn on version 1.2.0

Hi Howard,

We run Flink 1.2 on YARN without issues. Sorry, I don't have a specific solution, but are you sure you are not mixing Flink versions? In your logs I can see:

The configuration directory ('/home/software/flink-1.1.4/conf') contains both LOG4J and Logback configuration files. Please delete or rename one of them.

The path of the conf directory mentions 1.1.4 instead of 1.2.

Cheers,

Bruno

On Fri, 17 Feb 2017 at 08:50 Howard,Li(vip.com) <howard.li@vipshop.com> wrote:

Hi,

I'm trying to run Flink on YARN with the command: bin/flink run -m yarn-cluster -yn 2 -ys 4 ./examples/batch/WordCount.jar

But I got the following error:

2017-02-17 15:52:40,746 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli                - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
2017-02-17 15:52:40,746 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli                - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
2017-02-17 15:52:40,775 INFO  org.apache.flink.yarn.YarnClusterDescriptor                  - Using values:
2017-02-17 15:52:40,775 INFO  org.apache.flink.yarn.YarnClusterDescriptor                  -         TaskManager count = 2
2017-02-17 15:52:40,775 INFO  org.apache.flink.yarn.YarnClusterDescriptor                  -         JobManager memory = 1024
2017-02-17 15:52:40,775 INFO  org.apache.flink.yarn.YarnClusterDescriptor                  -         TaskManager memory = 1024
2017-02-17 15:52:40,796 INFO  org.apache.hadoop.yarn.client.RMProxy                        - Connecting to ResourceManager at /0.0.0.0:8032
2017-02-17 15:52:41,680 WARN  org.apache.flink.yarn.YarnClusterDescriptor                  - The configuration directory ('/home/software/flink-1.1.4/conf') contains both LOG4J and Logback configuration files. Please delete or rename one of them.
2017-02-17 15:52:41,702 INFO  org.apache.flink.yarn.Utils                                  - Copying from file:/home/software/flink-1.1.4/conf/logback.xml to hdfs://10.199.202.161:9000/user/root/.flink/application_1487247313588_0017/logback.xml
2017-02-17 15:52:42,025 INFO  org.apache.flink.yarn.Utils                                  - Copying from file:/home/software/flink-1.1.4/lib to hdfs://10.199.202.161:9000/user/root/.flink/application_1487247313588_0017/lib
2017-02-17 15:52:42,695 INFO  org.apache.flink.yarn.Utils                                  - Copying from file:/home/software/flink-1.1.4/conf/log4j.properties to hdfs://10.199.202.161:9000/user/root/.flink/application_1487247313588_0017/log4j.properties
2017-02-17 15:52:42,722 INFO  org.apache.flink.yarn.Utils                                  - Copying from file:/home/software/flink-1.1.4/lib/flink-dist_2.10-1.1.4.jar to hdfs://10.199.202.161:9000/user/root/.flink/application_1487247313588_0017/flink-dist_2.10-1.1.4.jar
2017-02-17 15:52:43,346 INFO  org.apache.flink.yarn.Utils                                  - Copying from /home/software/flink-1.1.4/conf/flink-conf.yaml to hdfs://10.199.202.161:9000/user/root/.flink/application_1487247313588_0017/flink-conf.yaml
2017-02-17 15:52:43,386 INFO  org.apache.flink.yarn.YarnClusterDescriptor                  - Submitting application master application_1487247313588_0017
2017-02-17 15:52:43,425 INFO  org.apache.hadoop.yarn.client.api.impl.YarnClientImpl        - Submitted application application_1487247313588_0017
2017-02-17 15:52:43,425 INFO  org.apache.flink.yarn.YarnClusterDescriptor                  - Waiting for the cluster to be allocated
2017-02-17 15:52:43,427 INFO  org.apache.flink.yarn.YarnClusterDescriptor                  - Deploying cluster, current state ACCEPTED
2017-02-17 15:52:48,471 INFO  org.apache.flink.yarn.YarnClusterDescriptor                  - YARN application has been deployed successfully.
Cluster started: Yarn cluster with application id application_1487247313588_0017
Using address 10.199.202.162:43809 to connect to JobManager.
JobManager web interface address http://vip-rc-ucsww.vclound.com:8088/proxy/application_1487247313588_0017/
Using the parallelism provided by the remote cluster (8). To use another parallelism, set it at the ./bin/flink client.
Starting execution of program
2017-02-17 15:52:49,278 INFO  org.apache.flink.yarn.YarnClusterClient                      - Starting program in interactive mode
Executing WordCount example with default input data set.
Use --input to specify file input.
Printing result to stdout. Use --output to specify output path.
2017-02-17 15:52:49,609 INFO  org.apache.flink.yarn.YarnClusterClient                      - Waiting until all TaskManagers have connected
Waiting until all TaskManagers have connected
2017-02-17 15:52:49,610 INFO  org.apache.flink.yarn.YarnClusterClient                      - Starting client actor system.

------------------------------------------------------------
The program finished with the following exception:

org.apache.flink.client.program.ProgramInvocationException: The main method caused an error.
    at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:525)
    at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:404)
    at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:321)
    at org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:777)
    at org.apache.flink.client.CliFrontend.run(CliFrontend.java:253)
    at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1005)
    at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1048)
Caused by: java.lang.RuntimeException: Unable to get ClusterClient status from Application Client
    at org.apache.flink.yarn.YarnClusterClient.getClusterStatus(YarnClusterClient.java:242)
    at org.apache.flink.yarn.YarnClusterClient.waitForClusterToBeReady(YarnClusterClient.java:514)
    at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:395)
    at org.apache.flink.yarn.YarnClusterClient.submitJob(YarnClusterClient.java:204)
    at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:383)
    at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:370)
    at org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:62)
    at org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:896)
    at org.apache.flink.api.java.DataSet.collect(DataSet.java:410)
    at org.apache.flink.api.java.DataSet.print(DataSet.java:1605)
    at org.apache.flink.examples.java.wordcount.WordCount.main(WordCount.java:92)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:510)
    ... 6 more
Caused by: org.apache.flink.runtime.leaderretrieval.LeaderRetrievalException: Could not retrieve the leader gateway
    at org.apache.flink.runtime.util.LeaderRetrievalUtils.retrieveLeaderGateway(LeaderRetrievalUtils.java:127)
    at org.apache.flink.client.program.ClusterClient.getJobManagerGateway(ClusterClient.java:645)
    at org.apache.flink.yarn.YarnClusterClient.getClusterStatus(YarnClusterClient.java:237)
    ... 21 more
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [10000 milliseconds]
    at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
    at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
    at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
    at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
    at scala.concurrent.Await$.result(package.scala:107)
    at scala.concurrent.Await.result(package.scala)
    at org.apache.flink.runtime.util.LeaderRetrievalUtils.retrieveLeaderGateway(LeaderRetrievalUtils.java:125)
    ... 23 more

2017-02-17 15:53:20,084 INFO  org.apache.flink.yarn.YarnClusterClient                      - Sending shutdown request to the Application Master
2017-02-17 15:53:20,085 INFO  org.apache.flink.yarn.YarnClusterClient                      - Start application client.
2017-02-17 15:53:20,088 WARN  org.apache.flink.yarn.YarnClusterClient                      - YARN reported application state FAILED
2017-02-17 15:53:20,089 WARN  org.apache.flink.yarn.YarnClusterClient                      - Diagnostics: Application application_1487247313588_0017 failed 1 times due to AM Container for appattempt_1487247313588_0017_000001 exited with  exitCode: -103
For more detailed output, check application tracking page:http://vip-rc-ucsww.vclound.com:8088/cluster/app/application_1487247313588_0017Then, click on links to logs of each attempt.
Diagnostics: Container [pid=18733,containerID=container_1487247313588_0017_01_000001] is running beyond virtual memory limits. Current usage: 264.7 MB of 1 GB physical memory used; 2.2 GB of 2.1 GB virtual memory used. Killing container.
Dump of the process-tree for container_1487247313588_0017_01_000001 :
    |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
    |- 18740 18733 18733 18733 (java) 955 64 2298933248 67430 /home/software/jdk1.8.0_111/bin/java -Xmx424M -Dlog.file=/home/software/hadoop-2.7.3/logs/userlogs/application_1487247313588_0017/container_1487247313588_0017_01_000001/jobmanager.log -Dlogback.configurationFile=file:logback.xml -Dlog4j.configuration=file:log4j.properties org.apache.flink.yarn.YarnApplicationMasterRunner
    |- 18733 18731 18733 18733 (bash) 0 0 108605440 334 /bin/bash -c /home/software/jdk1.8.0_111/bin/java -Xmx424M  -Dlog.file=/home/software/hadoop-2.7.3/logs/userlogs/application_1487247313588_0017/container_1487247313588_0017_01_000001/jobmanager.log -Dlogback.configurationFile=file:logback.xml -Dlog4j.configuration=file:log4j.properties org.apache.flink.yarn.YarnApplicationMasterRunner  1>/home/software/hadoop-2.7.3/logs/userlogs/application_1487247313588_0017/container_1487247313588_0017_01_000001/jobmanager.out 2>/home/software/hadoop-2.7.3/logs/userlogs/application_1487247313588_0017/container_1487247313588_0017_01_000001/jobmanager.err

Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
Failing this attempt. Failing the application.
2017-02-17 15:53:20,102 INFO  org.apache.flink.yarn.ApplicationClient                      - Notification about new leader address akka.tcp://flink@10.199.202.162:43809/user/jobmanager with session ID null.
2017-02-17 15:53:20,106 INFO  org.apache.flink.yarn.ApplicationClient                      - Sending StopCluster request to JobManager.
2017-02-17 15:53:20,107 INFO  org.apache.flink.yarn.ApplicationClient                      - Received address of new leader akka.tcp://flink@10.199.202.162:43809/user/jobmanager with session ID null.
2017-02-17 15:53:20,108 INFO  org.apache.flink.yarn.ApplicationClient                      - Disconnect from JobManager null.
2017-02-17 15:53:20,112 INFO  org.apache.flink.yarn.ApplicationClient                      - Trying to register at JobManager akka.tcp://flink@10.199.202.162:43809/user/jobmanager.
Listening for transport dt_socket at address: 5006
2017-02-17 15:53:20,624 INFO  org.apache.flink.yarn.ApplicationClient                      - Trying to register at JobManager akka.tcp://flink@10.199.202.162:43809/user/jobmanager.
2017-02-17 15:53:21,124 INFO  org.apache.flink.yarn.ApplicationClient                      - Sending StopCluster request to JobManager.
2017-02-17 15:53:21,645 INFO  org.apache.flink.yarn.ApplicationClient                      - Trying to register at JobManager akka.tcp://flink@10.199.202.162:43809/user/jobmanager.
2017-02-17 15:53:22,145 INFO  org.apache.flink.yarn.ApplicationClient                      - Sending StopCluster request to JobManager.
2017-02-17 15:53:23,165 INFO  org.apache.flink.yarn.ApplicationClient                      - Sending StopCluster request to JobManager.
2017-02-17 15:53:23,664 INFO  org.apache.flink.yarn.ApplicationClient                      - Trying to register at JobManager akka.tcp://flink@10.199.202.162:43809/user/jobmanager.
2017-02-17 15:53:24,185 INFO  org.apache.flink.yarn.ApplicationClient                      - Sending StopCluster request to JobManager.
2017-02-17 15:53:25,204 INFO  org.apache.flink.yarn.ApplicationClient                      - Sending StopCluster request to JobManager.

The main error is: org.apache.flink.runtime.leaderretrieval.LeaderRetrievalException: Could not retrieve the leader gateway. It may be related to https://issues.apache.org/jira/browse/FLINK-2821. That issue says the akka address always contains the IP rather than the hostname, but I find a hostname in the akka address in leaderRetrievalService.
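
Separately from the leader-retrieval error, the YARN diagnostics in the log above report that the AM container was killed for "running beyond virtual memory limits" (2.2 GB of 2.1 GB). The following is a minimal Python sketch of that check, using the VMEM_USAGE(BYTES) values from the process-tree dump of container_1487247313588_0017_01_000001. The ratio of 2.1 is an assumption: it is the default of `yarn.nodemanager.vmem-pmem-ratio` and matches the reported limit for a 1 GB container, but the cluster's actual setting is not shown in the log. The function and constant names are made up for illustration.

```python
# Assumption: yarn.nodemanager.vmem-pmem-ratio is at its default of 2.1,
# which matches the "2.1 GB" limit reported for a 1 GB container.
VMEM_PMEM_RATIO = 2.1
GIB = 1024 ** 3

# VMEM_USAGE(BYTES) values copied from the process-tree dump of
# container_1487247313588_0017_01_000001 (java and bash processes).
PROCESS_VMEM_BYTES = {"java": 2298933248, "bash": 108605440}


def vmem_limit_bytes(container_memory_mb, ratio=VMEM_PMEM_RATIO):
    """Virtual memory limit for a container: physical memory times the ratio."""
    return container_memory_mb * 1024 * 1024 * ratio


def exceeds_vmem_limit(process_vmem, container_memory_mb):
    """Return (used_bytes, limit_bytes, exceeded) for a whole process tree."""
    used = sum(process_vmem.values())
    limit = vmem_limit_bytes(container_memory_mb)
    return used, limit, used > limit


if __name__ == "__main__":
    # JobManager memory = 1024 (MB) according to the client log.
    used, limit, exceeded = exceeds_vmem_limit(PROCESS_VMEM_BYTES, 1024)
    print(f"used  = {used / GIB:.1f} GB")
    print(f"limit = {limit / GIB:.1f} GB")
    print(f"exceeded = {exceeded}")
```

With the numbers from the dump this gives 2.2 GB used against a 2.1 GB limit, which agrees with the diagnostics line. If this check is what kills the AM, the JobManager would be gone before the client can retrieve the leader gateway; that connection is a guess from the log and is not confirmed in this thread. The related settings in yarn-site.xml are `yarn.nodemanager.vmem-pmem-ratio` and `yarn.nodemanager.vmem-check-enabled`.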

This problem does not appear in 1.1.4.

Thank you all.

Howard

This communication is intended only for the addressee(s) and may contain information that is privileged and confidential. You are hereby notified that, if you are not an intended recipient listed above, or an authorized employee or agent of an addressee of this communication responsible for delivering e-mail messages to an intended recipient, any dissemination, distribution or reproduction of this communication (including any attachments hereto) is strictly prohibited. If you have received this communication in error, please notify us immediately by a reply e-mail addressed to the sender and permanently delete the original e-mail communication and any attachments from all storage devices without making or otherwise retaining a copy.


--001a11470abc8a74c20548b7711a--