flink-user mailing list archives

From Till Rohrmann <trohrm...@apache.org>
Subject Re: some questions about submit flink job on flink-yarn
Date Sun, 15 Jan 2017 12:10:54 GMT
Hi Huang,

the reason you cannot use the IP address to send messages to your YARN
JobManager is that we no longer resolve the hostname into an IP address.
Instead, we start the ActorSystem with the unresolved hostname. You can see
this in the following log line: `Actor system bound to hostname
9-96-101-251`. Since Akka requires that the destination address of a
message exactly matches the address to which the ActorSystem is bound, you
have to use `9-96-101-251:38785` instead of the IP. This behaviour was changed recently.
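
For example, taking the command from your earlier mail as a template (assuming the
hostname 9-96-101-251 is resolvable from the machine you submit from), the submission
should look roughly like this:

`./bin/flink run -m 9-96-101-251:38785 ./examples/batch/WordCount.jar`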

Concerning the ports, Flink chooses a random port for the `JobManager` in
order to avoid port conflicts with other `JobManagers` running on the same
node. With YARN you don't have control over where the `JobManager` is
placed. However, you can use the configuration parameter
`yarn.application-master.port` to specify a port or a port range for the
application master/job manager.
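
As a sketch, the corresponding entry in `flink-conf.yaml` could look like the following
(the range 50100-50200 is only an illustration, pick whatever fits your environment; a
single port or a list of ranges should work as well):

`yarn.application-master.port: 50100-50200`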

Additionally, the web frontend's port is always overwritten and set to 0,
which means a random port is selected when starting a YARN session.

I hope this clarifies things a little bit.

Cheers,
Till

On Sat, Jan 14, 2017 at 2:52 AM, huangwei (G) <huangwei111@huawei.com>
wrote:

> Hi Till,
>
> The "9-96-101-177" is just the hostname.
> I reran Flink on YARN and here is the jobmanager.log; sorry, I have redacted
> some sensitive log entries. By the way, the ports (another question in my
> earlier mail) seem to be random values (this time 38785 and 35699).
> I am using flink-1.2.0. It works well on the YARN provided by Apache
> open source, but I am running Flink on a special YARN distribution that has had
> some security hardening applied on top of Apache YARN. I just have no idea about the ERROR log.
>
> jobmanager.log:
>
> 2017-01-14 09:24:35,584 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner
>            - ------------------------------------------------------------
> --------------------
> 2017-01-14 09:24:35,585 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner
>            -  Starting YARN ApplicationMaster / ResourceManager /
> JobManager (Version: 1.2.0, Rev:82b1079, Date:04.01.2017 @ 17:38:23 CST)
> 2017-01-14 09:24:35,585 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner
>            -  Current user: admin
> 2017-01-14 09:24:35,585 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner
>            -  JVM: Java HotSpot(TM) 64-Bit Server VM - Oracle Corporation -
> 1.8/25.112-b15
> 2017-01-14 09:24:35,585 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner
>            -  Maximum heap size: 406 MiBytes
> 2017-01-14 09:24:35,585 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner
>            -  JAVA_HOME: /opt/huawei/Bigdata/jdk1.8.0_112/
> 2017-01-14 09:24:35,587 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner
>            -  Hadoop version: 2.7.2
> 2017-01-14 09:24:35,587 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner
>            -  JVM Options:
> 2017-01-14 09:24:35,587 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner
>            -     -Xmx424M
> 2017-01-14 09:24:35,587 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner
>            -     -Dlog.file=/srv/BigData/hadoop/data1/nm/containerlogs/
> application_1483499303549_0043/container_1483499303549_
> 0043_01_000001/jobmanager.log
> 2017-01-14 09:24:35,587 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner
>            -     -Dlogback.configurationFile=file:logback.xml
> 2017-01-14 09:24:35,587 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner
>            -     -Dlog4j.configuration=file:log4j.properties
> 2017-01-14 09:24:35,587 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner
>            -  Program Arguments: (none)
> 2017-01-14 09:24:35,589 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner
>            - ------------------------------------------------------------
> --------------------
> 2017-01-14 09:24:35,589 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner
>            - Registered UNIX signal handlers for [TERM, HUP, INT]
> 2017-01-14 09:24:35,591 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner
>            - remoteKeytabPrincipal obtained admin
> 2017-01-14 09:24:35,592 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner
>            - YARN daemon is running as: admin Yarn client user obtainer:
> admin@HADOOP.COM
> 2017-01-14 09:24:35,596 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner
>            - Loading config from directory /srv/BigData/hadoop/data1/nm/
> localdir/usercache/admin/appcache/application_
> 1483499303549_0043/container_1483499303549_0043_01_000001
> 2017-01-14 09:24:35,598 INFO  org.apache.flink.configuration.GlobalConfiguration
>           - Loading configuration property: jobmanager.rpc.address,
> 9.96.101.32
> 2017-01-14 09:24:35,598 INFO  org.apache.flink.configuration.GlobalConfiguration
>           - Loading configuration property: jobmanager.rpc.port, 6123
> 2017-01-14 09:24:35,598 INFO  org.apache.flink.configuration.GlobalConfiguration
>           - Loading configuration property: jobmanager.heap.mb, 256
> 2017-01-14 09:24:35,598 INFO  org.apache.flink.configuration.GlobalConfiguration
>           - Loading configuration property: taskmanager.heap.mb, 512
> 2017-01-14 09:24:35,598 INFO  org.apache.flink.configuration.GlobalConfiguration
>           - Loading configuration property: taskmanager.numberOfTaskSlots, 1
> 2017-01-14 09:24:35,598 INFO  org.apache.flink.configuration.GlobalConfiguration
>           - Loading configuration property: taskmanager.memory.preallocate,
> false
> 2017-01-14 09:24:35,599 INFO  org.apache.flink.configuration.GlobalConfiguration
>           - Loading configuration property: parallelism.default, 1
> 2017-01-14 09:24:35,599 INFO  org.apache.flink.configuration.GlobalConfiguration
>           - Loading configuration property: jobmanager.web.port, 8081
> 2017-01-14 09:24:35,599 INFO  org.apache.flink.configuration.GlobalConfiguration
>           - Loading configuration property: security.keytab,
> /home/demo/flink/release/flink-1.2.0/keytab/user.keytab
> 2017-01-14 09:24:35,599 INFO  org.apache.flink.configuration.GlobalConfiguration
>           - Loading configuration property: security.principal, admin
> 2017-01-14 09:24:35,608 INFO  org.apache.flink.runtime.security.JaasConfiguration
>          - Initializing JAAS configuration instance. Parameters:
> /srv/BigData/hadoop/data1/nm/localdir/usercache/admin/
> appcache/application_1483499303549_0043/container_
> 1483499303549_0043_01_000001/krb5.keytab, admin
> 2017-01-14 09:24:35,609 INFO  org.apache.flink.runtime.security.SecurityUtils
>              - SASL client auth for ZK will be disabled
> 2017-01-14 09:24:35,824 INFO  org.apache.hadoop.security.UserGroupInformation
>              - Login successful for user admin using keytab file
> /srv/BigData/hadoop/data1/nm/localdir/usercache/admin/
> appcache/application_1483499303549_0043/container_
> 1483499303549_0043_01_000001/krb5.keytab
> 2017-01-14 09:24:35,825 INFO  org.apache.flink.runtime.security.SecurityUtils
>              - Hadoop user set to admin@HADOOP.COM (auth:KERBEROS)
> 2017-01-14 09:24:35,936 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner
>            - YARN assigned hostname for application master: 9-96-101-251
> 2017-01-14 09:24:35,936 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner
>            - keytabPath: /srv/BigData/hadoop/data1/nm/
> localdir/usercache/admin/appcache/application_
> 1483499303549_0043/container_1483499303549_0043_01_000001/krb5.keytab
> 2017-01-14 09:24:35,938 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner
>            - TaskManagers will be created with 1 task slots
> 2017-01-14 09:24:35,938 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner
>            - TaskManagers will be started with container size 1024 MB, JVM
> heap size 424 MB, JVM direct memory limit 424 MB
> 2017-01-14 09:24:35,943 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner
>            - Trying to start actor system at 9.96.101.251:38785
> 2017-01-14 09:24:36,336 INFO  akka.event.slf4j.Slf4jLogger
>                   - Slf4jLogger started
> 2017-01-14 09:24:36,438 INFO  Remoting
>                   - Starting remoting
> 2017-01-14 09:24:36,547 INFO  Remoting
>                   - Remoting started; listening on addresses
> :[akka.tcp://flink@9-96-101-251:38785]
> 2017-01-14 09:24:36,551 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner
>            - Actor system started at 9.96.101.251:38785
> 2017-01-14 09:24:36,551 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner
>            - Actor system bound to hostname 9-96-101-251.
> 2017-01-14 09:24:36,554 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner
>            - Setting up resources for TaskManagers
> 2017-01-14 09:24:36,554 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner
>            - TM:remoteKeytabPath obtained hdfs://hacluster/user/admin/.
> flink/application_1483499303549_0043/user.keytab
> 2017-01-14 09:24:36,555 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner
>            - TM:remoteKeytabPrincipal obtained admin
> 2017-01-14 09:24:36,555 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner
>            - TM:remoteYarnConfPath obtained null
> 2017-01-14 09:24:36,555 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner
>            - TM:remotekrb5Path obtained null
> 2017-01-14 09:24:36,932 WARN  org.apache.hadoop.util.NativeCodeLoader
>                    - Unable to load native-hadoop library for your
> platform... using builtin-java classes where applicable
> 2017-01-14 09:24:36,945 WARN  org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory
>      - The short-circuit local reads feature cannot be used because
> libhadoop cannot be loaded.
> 2017-01-14 09:24:36,949 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner
>            - Adding keytab hdfs://hacluster/user/admin/.flink/application_1483499303549_0043/user.keytab
> to the AM container local resource bucket
> 2017-01-14 09:24:37,085 INFO  org.apache.flink.yarn.Utils
>                  - Copying from file:/srv/BigData/hadoop/
> data1/nm/localdir/usercache/admin/appcache/application_
> 1483499303549_0043/container_1483499303549_0043_01_000001/
> e2943789-d80a-4abd-8ae9-2fc14cb1fd03-taskmanager-conf.yaml to
> hdfs://hacluster/user/admin/.flink/application_
> 1483499303549_0043/e2943789-d80a-4abd-8ae9-2fc14cb1fd03-
> taskmanager-conf.yaml
> 2017-01-14 09:24:37,258 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner
>            - Prepared local resource for modified yaml: resource { scheme:
> "hdfs" host: "hacluster" port: -1 file: "/user/admin/.flink/
> application_1483499303549_0043/e2943789-d80a-4abd-8ae9-
> 2fc14cb1fd03-taskmanager-conf.yaml" } size: 878 timestamp: 1484357077250
> type: FILE visibility: APPLICATION
> 2017-01-14 09:24:37,265 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner
>            - Creating container launch context for TaskManagers
> 2017-01-14 09:24:37,265 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner
>            - Starting TaskManagers with command: $JAVA_HOME/bin/java
> -Xms424m -Xmx424m -XX:MaxDirectMemorySize=424m  -Dlog.file=<LOG_DIR>/taskmanager.log
> -Dlogback.configurationFile=file:./logback.xml
> -Dlog4j.configuration=file:./log4j.properties org.apache.flink.yarn.YarnTaskManager
> --configDir . 1> <LOG_DIR>/taskmanager.out 2> <LOG_DIR>/taskmanager.err
> 2017-01-14 09:24:37,288 INFO  org.apache.flink.runtime.blob.BlobServer
>                   - Created BLOB server storage directory
> /tmp/blobStore-f91bad88-1473-4e86-b151-f93dffa58baa
> 2017-01-14 09:24:37,289 INFO  org.apache.flink.runtime.blob.BlobServer
>                   - Started BLOB server at 0.0.0.0:53972 - max concurrent
> requests: 50 - max backlog: 1000
> 2017-01-14 09:24:37,302 INFO  org.apache.flink.runtime.metrics.MetricRegistry
>              - No metrics reporter configured, no metrics will be
> exposed/reported.
> 2017-01-14 09:24:37,307 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner
>            - Starting JobManager Web Frontend
> 2017-01-14 09:24:37,310 INFO  org.apache.flink.runtime.jobmanager.MemoryArchivist
>          - Started memory archivist akka://flink/user/$a
> 2017-01-14 09:24:37,311 INFO  org.apache.flink.yarn.YarnJobManager
>                   - Starting JobManager at akka.tcp://flink@9-96-101-251:
> 38785/user/jobmanager.
> 2017-01-14 09:24:37,318 INFO  org.apache.flink.runtime.webmonitor.WebMonitorUtils
>          - Determined location of JobManager log file:
> /srv/BigData/hadoop/data1/nm/containerlogs/application_
> 1483499303549_0043/container_1483499303549_0043_01_000001/jobmanager.log
> 2017-01-14 09:24:37,318 INFO  org.apache.flink.runtime.webmonitor.WebMonitorUtils
>          - Determined location of JobManager stdout file:
> /srv/BigData/hadoop/data1/nm/containerlogs/application_
> 1483499303549_0043/container_1483499303549_0043_01_000001/jobmanager.out
> 2017-01-14 09:24:37,318 INFO  org.apache.flink.runtime.webmonitor.WebRuntimeMonitor
>        - Using directory /tmp/flink-web-c4991b46-e637-4207-80ed-caef4cf5702e
> for the web interface files
> 2017-01-14 09:24:37,364 INFO  org.apache.flink.runtime.webmonitor.WebRuntimeMonitor
>        - Using directory /tmp/flink-web-f8b7b6d7-b066-48a1-9536-40a3cdf42778
> for web frontend JAR file uploads
> 2017-01-14 09:24:37,378 INFO  org.apache.flink.yarn.YarnJobManager
>                   - JobManager akka.tcp://flink@9-96-101-251:38785/user/jobmanager
> was granted leadership with leader session ID None.
> 2017-01-14 09:24:37,560 INFO  org.apache.flink.runtime.webmonitor.WebRuntimeMonitor
>        - Web frontend listening at 0:0:0:0:0:0:0:0:35699
> 2017-01-14 09:24:37,561 INFO  org.apache.flink.runtime.webmonitor.WebRuntimeMonitor
>        - Starting with JobManager akka.tcp://flink@9-96-101-251:38785/user/jobmanager
> on port 35699
> 2017-01-14 09:24:37,561 INFO  org.apache.flink.runtime.webmonitor.JobManagerRetriever
>      - New leader reachable under akka://flink/user/jobmanager#-
> 640052308:null.
> 2017-01-14 09:24:37,568 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner
>            - YARN application tolerates 4 failed TaskManager containers
> before giving up
> 2017-01-14 09:24:37,571 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner
>            - YARN Application Master started
> 2017-01-14 09:24:37,579 INFO  org.apache.flink.yarn.YarnFlinkResourceManager
>               - Initializing YARN resource master
> 2017-01-14 09:24:37,605 INFO  org.apache.hadoop.yarn.client.api.impl.
> ContainerManagementProtocolProxy  - yarn.client.max-cached-nodemanagers-proxies
> : 0
> 2017-01-14 09:24:37,606 INFO  org.apache.flink.yarn.YarnFlinkResourceManager
>               - Registering Application Master with tracking url
> http://9-96-101-251:35699
> 2017-01-14 09:24:37,641 INFO  org.apache.flink.yarn.YarnFlinkResourceManager
>               - Trying to associate with JobManager leader
> akka://flink/user/jobmanager#-640052308
> 2017-01-14 09:24:37,647 INFO  org.apache.flink.yarn.YarnFlinkResourceManager
>               - Resource Manager associating with leading JobManager
> Actor[akka://flink/user/jobmanager#-640052308] - leader session null
> 2017-01-14 09:24:37,648 INFO  org.apache.flink.yarn.YarnFlinkResourceManager
>               - Requesting new TaskManager container with 1024 megabytes
> memory. Pending requests: 1
> 2017-01-14 09:24:37,654 INFO  org.apache.flink.yarn.YarnFlinkResourceManager
>               - Requesting new TaskManager container with 1024 megabytes
> memory. Pending requests: 2
> 2017-01-14 09:24:37,654 INFO  org.apache.flink.yarn.YarnFlinkResourceManager
>               - Requesting new TaskManager container with 1024 megabytes
> memory. Pending requests: 3
> 2017-01-14 09:24:37,655 INFO  org.apache.flink.yarn.YarnFlinkResourceManager
>               - Requesting new TaskManager container with 1024 megabytes
> memory. Pending requests: 4
> 2017-01-14 09:24:38,681 INFO  org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl
>        - Received new token for : 9-96-101-177:26009
> 2017-01-14 09:24:38,692 INFO  org.apache.flink.yarn.YarnFlinkResourceManager
>               - Received new container: container_1483499303549_0043_01_000002
> - Remaining pending container requests: 3
> 2017-01-14 09:24:38,693 INFO  org.apache.flink.yarn.YarnFlinkResourceManager
>               - Launching TaskManager in container ContainerInLaunch @
> 1484357078692: Container: [ContainerId: container_1483499303549_0043_01_000002,
> NodeId: 9-96-101-177:26009, NodeHttpAddress: 9-96-101-177:26010, Resource:
> <memory:1024, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken,
> service: 9.96.101.177:26009 }, ] on host 9-96-101-177
> 2017-01-14 09:24:38,694 INFO  org.apache.hadoop.yarn.client.api.impl.
> ContainerManagementProtocolProxy  - Opening proxy : 9-96-101-177:26009
> 2017-01-14 09:24:39,189 INFO  org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl
>        - Received new token for : 9-96-101-251:26009
> 2017-01-14 09:24:39,189 INFO  org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl
>        - Received new token for : 9-96-101-32:26009
> 2017-01-14 09:24:39,189 INFO  org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl
>        - Received new token for : 9-91-8-160:26009
> 2017-01-14 09:24:39,190 INFO  org.apache.flink.yarn.YarnFlinkResourceManager
>               - Received new container: container_1483499303549_0043_01_000003
> - Remaining pending container requests: 2
> 2017-01-14 09:24:39,190 INFO  org.apache.flink.yarn.YarnFlinkResourceManager
>               - Launching TaskManager in container ContainerInLaunch @
> 1484357079190: Container: [ContainerId: container_1483499303549_0043_01_000003,
> NodeId: 9-96-101-251:26009, NodeHttpAddress: 9-96-101-251:26010, Resource:
> <memory:1024, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken,
> service: 9.96.101.251:26009 }, ] on host 9-96-101-251
> 2017-01-14 09:24:39,190 INFO  org.apache.hadoop.yarn.client.api.impl.
> ContainerManagementProtocolProxy  - Opening proxy : 9-96-101-251:26009
> 2017-01-14 09:24:39,202 INFO  org.apache.flink.yarn.YarnFlinkResourceManager
>               - Received new container: container_1483499303549_0043_01_000004
> - Remaining pending container requests: 1
> 2017-01-14 09:24:39,202 INFO  org.apache.flink.yarn.YarnFlinkResourceManager
>               - Launching TaskManager in container ContainerInLaunch @
> 1484357079202: Container: [ContainerId: container_1483499303549_0043_01_000004,
> NodeId: 9-96-101-32:26009, NodeHttpAddress: 9-96-101-32:26010, Resource:
> <memory:1024, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken,
> service: 9.96.101.32:26009 }, ] on host 9-96-101-32
> 2017-01-14 09:24:39,202 INFO  org.apache.hadoop.yarn.client.api.impl.
> ContainerManagementProtocolProxy  - Opening proxy : 9-96-101-32:26009
> 2017-01-14 09:24:39,217 INFO  org.apache.flink.yarn.YarnFlinkResourceManager
>               - Received new container: container_1483499303549_0043_01_000005
> - Remaining pending container requests: 0
> 2017-01-14 09:24:39,217 INFO  org.apache.flink.yarn.YarnFlinkResourceManager
>               - Launching TaskManager in container ContainerInLaunch @
> 1484357079217: Container: [ContainerId: container_1483499303549_0043_01_000005,
> NodeId: 9-91-8-160:26009, NodeHttpAddress: 9-91-8-160:26010, Resource:
> <memory:1024, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken,
> service: 9.91.8.160:26009 }, ] on host 9-91-8-160
> 2017-01-14 09:24:39,217 INFO  org.apache.hadoop.yarn.client.api.impl.
> ContainerManagementProtocolProxy  - Opening proxy : 9-91-8-160:26009
> 2017-01-14 09:24:43,348 INFO  org.apache.flink.yarn.YarnFlinkResourceManager
>               - TaskManager container_1483499303549_0043_01_000003 has
> started.
> 2017-01-14 09:24:43,350 INFO  org.apache.flink.runtime.instance.InstanceManager
>            - Registered TaskManager at 9-96-101-251
> (akka.tcp://flink@9-96-101-251:57010/user/taskmanager) as
> 49800ab8cfcd1a11e45084a48281df75. Current number of registered hosts is
> 1. Current number of alive task slots is 1.
> 2017-01-14 09:24:44,881 INFO  org.apache.flink.yarn.YarnFlinkResourceManager
>               - TaskManager container_1483499303549_0043_01_000002 has
> started.
> 2017-01-14 09:24:44,881 INFO  org.apache.flink.runtime.instance.InstanceManager
>            - Registered TaskManager at 9-96-101-177
> (akka.tcp://flink@9-96-101-177:35778/user/taskmanager) as
> 5d65baf1ec196cf3ac5bc43870156855. Current number of registered hosts is
> 2. Current number of alive task slots is 2.
> 2017-01-14 09:24:45,855 INFO  org.apache.flink.yarn.YarnFlinkResourceManager
>               - TaskManager container_1483499303549_0043_01_000004 has
> started.
> 2017-01-14 09:24:45,855 INFO  org.apache.flink.runtime.instance.InstanceManager
>            - Registered TaskManager at 9-96-101-32
> (akka.tcp://flink@9-96-101-32:58486/user/taskmanager) as
> 80de4a47fa60536b78ea052cbadec7ee. Current number of registered hosts is
> 3. Current number of alive task slots is 3.
> 2017-01-14 09:24:46,018 INFO  org.apache.flink.yarn.YarnFlinkResourceManager
>               - TaskManager container_1483499303549_0043_01_000005 has
> started.
> 2017-01-14 09:24:46,018 INFO  org.apache.flink.runtime.instance.InstanceManager
>            - Registered TaskManager at 9-91-8-160
> (akka.tcp://flink@9-91-8-160:47548/user/taskmanager) as
> e9f2497d6223b2d704b3aced665a3c02. Current number of registered hosts is
> 4. Current number of alive task slots is 4.
> 2017-01-14 09:29:58,066 ERROR akka.remote.EndpointWriter
>                   - dropping message [class akka.actor.ActorSelectionMessage]
> for non-local recipient [Actor[akka.tcp://flink@9.96.101.251:38785/]]
> arriving at [akka.tcp://flink@9.96.101.251:38785] inbound addresses are
> [akka.tcp://flink@9-96-101-251:38785]
>
> Thanks!
>
> HuangWHWHW
> 2017/1/14
>
> -----Original Message-----
> From: Till Rohrmann [mailto:trohrmann@apache.org]
> Sent: January 13, 2017 18:22
> To: dev@flink.apache.org
> Cc: user@flink.apache.org
> Subject: Re: some questions about submit flink job on flink-yarn
>
> Hi Huang,
>
> this seems to be very strange, because the JobManager’s actor system has
> bound to the address 9-96-101-177 instead of 9.96.101.177. It seems as if
> the dots have been replaced by dashes.
>
> Could you maybe tell me which version of Flink you’re running and also
> share the complete JobManager log with us?
>
> I tested it with the latest 1.2 SNAPSHOT version and there it seemed to
> work.
>
> Cheers,
> Till
>
> On Fri, Jan 13, 2017 at 9:02 AM, huangwei (G) <huangwei111@huawei.com>
> wrote:
>
> > Dear All,
> >
> > I get the following error in jobmanager.log when I submit a Flink job
> > (batch/WordCount.jar) using the command: "./bin/flink run -m
> > 9.96.101.177:39180 ./examples/batch/WordCount.jar".
> >
> > Flink is running on a YARN cluster.
> >
> > Error in jobmanager.log:
> > 2017-01-13 15:28:27,402 ERROR akka.remote.EndpointWriter
> >                   - dropping message [class
> > akka.actor.ActorSelectionMessage] for non-local recipient
> > [Actor[akka.tcp://flink@9.96.101.177:39180/]]
> > arriving at [akka.tcp://flink@9.96.101.177:39180] inbound addresses
> > are [akka.tcp://flink@9-96-101-177:39180]
> >
> > However, the submission succeeds when I use the Flink web UI.
> >
> > How can I solve this problem?
> >
> > Also, when I started Flink on YARN, the jobmanager.rpc.port and the
> > web port were changed to 39180 and 57724 respectively.
> > The following configuration in flink-conf.yaml is just the default:
> >
> > jobmanager.rpc.port: 6123
> >
> > and
> >
> > jobmanager.web.port: 8081
> >
> > I started Flink on YARN using the command: "./bin/yarn-session.sh -n 4".
> >
> > Why were the ports changed to 39180 and 57724?
> >
> > Many thanks for any help!
> >
> > HuangWHWHW
> > 2017.1.13
> >
>
