flink-user mailing list archives

From Piotr Nowojski <pi...@data-artisans.com>
Subject Re: Docker-Flink Project: TaskManagers can't talk to JobManager if they are on different nodes
Date Mon, 06 Nov 2017 08:22:56 GMT
Till, is there a list somewhere of the ports that need to be exposed that is more up to date than
the docker-flink README?

Piotrek

> On 3 Nov 2017, at 10:23, Vergilio, Thalita <t.vergilio4822@student.leedsbeckett.ac.uk>
wrote:
> 
> Just an update: by changing the JOB_MANAGER_RPC_ADDRESS to the public IP of the JobManager
and exposing port 6123 as {{PUBLIC_IP}}:6123:6123, I managed to get the TaskManagers from different
nodes and even different subnets to talk to the JobManager.
> 
> This is how I created the services:
> 
> docker network create -d overlay overlay
> 
> docker service create --name jobmanager \
>   --env JOB_MANAGER_RPC_ADDRESS={{PUBLIC_IP}} \
>   -p 8081:8081 -p {{PUBLIC_IP}}:6123:6123 -p 48081:48081 -p 6124:6124 -p 6125:6125 \
>   --network overlay --constraint 'node.hostname == ubuntu-swarm-manager' \
>   flink jobmanager
> 
> docker service create --name taskmanager \
>   --env JOB_MANAGER_RPC_ADDRESS={{PUBLIC_IP}} \
>   -p 6121:6121 -p 6122:6122 \
>   --network overlay --constraint 'node.hostname != ubuntu-swarm-manager' \
>   flink taskmanager
> 
> However, I am still encountering errors further down the line. When I submit a job using
the Web UI, it fails because the JobManager can't talk to the TaskManager on port 35033. I
presume this is the taskmanager.data.port, which needs to be set to a range and this range
exposed when I create the service?
> 
> Are there any other ports that I need to open at service creation time?
> 
> Connecting the channel failed: Connecting to remote task manager + '/{{IP_ADDRESS_OF_MANAGER}}:35033'
has failed. This might indicate that the remote task manager has been lost.
> 	at org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory$ConnectingChannel.waitForChannel(PartitionRequestClientFactory.java:196)
> 	at org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory$ConnectingChannel.access$000(PartitionRequestClientFactory.java:131)
> 	at org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory.createPartitionRequestClient(PartitionRequestClientFactory.java:83)
> 	at org.apache.flink.runtime.io.network.netty.NettyConnectionManager.createPartitionRequestClient(NettyConnectionManager.java:59)
> 	at org.apache.flink.runtime.io.network.partition.consumer.RemoteInputChannel.requestSubpartition(RemoteInputChannel.java:112)
> 	at org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.requestPartitions(SingleInputGate.java:433)
> 	at org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.getNextBufferOrEvent(SingleInputGate.java:455)
> 	at org.apache.flink.streaming.runtime.io.BarrierTracker.getNextNonBlocked(BarrierTracker.java:91)
> 	at org.apache.flink.streaming.runtime.io.StreamInputProcessor.processInput(StreamInputProcessor.java:213)
> 	at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(OneInputStreamTask.java:69)
> 	at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:263)
> 	at org.apache.flink.runtime.taskmanager.Task.run(Task.java:702)
> 	at java.lang.Thread.run(Thread.java:748)
> 
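
Port 35033 in the trace above is consistent with `taskmanager.data.port` being left at its default of 0, which lets the OS choose a free port on every start. A minimal sketch of pinning the TaskManager ports instead, assuming the configuration is read from a `flink-conf.yaml` at the path in `CONF_FILE` (the path and the `set_conf_key` helper are hypothetical, and whether the docker-flink image picks up this file is not confirmed in this thread):

```shell
#!/bin/sh
# Sketch only: pin Flink ports in flink-conf.yaml so they can be published.
# set_conf_key is a hypothetical helper, not part of Flink or docker-flink.
set_conf_key() {
  conf_file=$1
  key=$2
  value=$3
  touch "$conf_file"
  # Keep every line that does not start with "<key>:", then append the new value.
  awk -v k="${key}:" 'index($0, k) != 1' "$conf_file" > "${conf_file}.tmp"
  printf '%s: %s\n' "$key" "$value" >> "${conf_file}.tmp"
  mv "${conf_file}.tmp" "$conf_file"
}

# Assumed location of the config file; adjust to the actual image layout.
CONF_FILE=${CONF_FILE:-./flink-conf.yaml}

# 6121 and 6122 are the data and RPC ports named in the docker-flink README.
set_conf_key "$CONF_FILE" taskmanager.data.port 6121
set_conf_key "$CONF_FILE" taskmanager.rpc.port 6122
```

With the ports fixed, the `-p 6121:6121 -p 6122:6122` flags already used in the taskmanager service command would map to ports the TaskManager actually listens on.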
> 
> From: Piotr Nowojski <piotr@data-artisans.com>
> Sent: 02 November 2017 14:26:32
> To: Vergilio, Thalita
> Cc: user@flink.apache.org
> Subject: Re: Docker-Flink Project: TaskManagers can't talk to JobManager if they are
on different nodes
>  
> Did you try to expose the required ports that are listed in the README when starting the
containers?
> 
> https://github.com/apache/flink/tree/master/flink-contrib/docker-flink
> Ports:
> • The Web Client is on port 48081
> • JobManager RPC port 6123 (default, not exposed to host)
> • TaskManagers RPC port 6122 (default, not exposed to host)
> • TaskManagers Data port 6121 (default, not exposed to host)
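
The ports listed above can be turned into publish flags mechanically. A minimal sketch, where `publish_flags` is a hypothetical helper and the flags are only printed rather than passed to docker:

```shell
#!/bin/sh
# Hypothetical helper: turn a list of ports into `-p PORT:PORT` flags.
publish_flags() {
  flags=""
  for port in "$@"; do
    flags="$flags -p ${port}:${port}"
  done
  # Strip the leading space before printing.
  printf '%s\n' "${flags# }"
}

# Ports from the README list: web client and JobManager RPC.
JM_FLAGS=$(publish_flags 48081 6123)
# TaskManager RPC and data ports.
TM_FLAGS=$(publish_flags 6122 6121)

# Print the flags instead of running docker, so the sketch has no side effects.
echo "jobmanager:  $JM_FLAGS"
echo "taskmanager: $TM_FLAGS"
```

The printed flags can then be pasted into a `docker service create` command; whether this list is complete for a given Flink version is exactly what is being asked in this thread.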
> 
> Piotrek
> 
>> On 2 Nov 2017, at 14:44, javalass <t.vergilio4822@student.leedsbeckett.ac.uk>
wrote:
>> 
>> I am using the Docker-Flink project in:
>> https://github.com/apache/flink/tree/master/flink-contrib/docker-flink
>> 
>> I am creating the services with the following commands:
>> docker network create -d overlay overlay
>> docker service create --name jobmanager --env
>> JOB_MANAGER_RPC_ADDRESS=jobmanager -p 8081:8081 --network overlay
>> --constraint 'node.hostname == ubuntu-swarm-manager' flink jobmanager
>> docker service create --name taskmanager --env
>> JOB_MANAGER_RPC_ADDRESS=jobmanager --network overlay --constraint
>> 'node.hostname != ubuntu-swarm-manager' flink taskmanager
>> 
>> I wonder if there's any configuration I'm missing. This is the error I get:
>> - Trying to register at JobManager akka.tcp://flink@jobmanager:6123/user/jobmanager (attempt 4, timeout: 4000 milliseconds)
>> 
>> --
>> Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
