flink-user mailing list archives

From Till Rohrmann <trohrm...@apache.org>
Subject Re: Flink on YARN: Cannot connect to JobManager
Date Sun, 15 Jan 2017 11:32:56 GMT
Hi Malte,

Could it be that you are requesting more resources from your YARN
cluster than are currently available? It depends a little on your
other settings, but -yn 2 requests 2 TaskManagers. In addition, Flink
allocates one more container for the JobManager. By default, the
TaskManager and JobManager containers are each started with 1 GB of
memory. The job therefore needs at least 3 containers with 3 GB of
memory in total. Could you check whether these resources are available
in your YARN cluster?
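
As a rough sketch of that arithmetic (the function name and parameters are
made up for illustration and are not part of Flink; 1 GB is taken as 1024 MB,
matching the defaults mentioned above):

```python
# Illustrative only: estimates what a "-yn N" submission asks YARN for.
# The defaults mirror the 1 GB per container mentioned in this mail.
def yarn_request(task_managers, tm_memory_mb=1024, jm_memory_mb=1024):
    """Return (containers, total_memory_mb) requested from YARN."""
    containers = task_managers + 1  # one extra container for the JobManager
    total_mb = task_managers * tm_memory_mb + jm_memory_mb
    return containers, total_mb


containers, total_mb = yarn_request(task_managers=2)  # corresponds to -yn 2
print(containers, total_mb)
```

YARN may round each request up to its configured minimum allocation, so
treat the result as a lower bound on what the cluster must have free.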

If they are available, then this points to faulty behaviour. In that
case it would be great if you could share the aggregated YARN logs for
the Flink application with us (available after terminating the YARN
application). That would help with further debugging of the problem.

Cheers,
Till

On Thu, Jan 12, 2017 at 4:13 PM, Malte Schwarzer <impressum@mieo.de> wrote:

> Hi all,
>
> I am trying to run a Flink job on YARN via "$/bin/flink run -m yarn-cluster
> -yn 2 ..." with two nodes. But only one JobManager seems to be connected.
>
> Flink hangs at this stage (the lookup message repeats every second):
>
> 2017-01-11 15:12:13,653 DEBUG org.apache.flink.yarn.YarnClusterClient - Looking up JobManager
> 2017-01-11 15:12:13,678 INFO org.apache.flink.yarn.YarnClusterClient - TaskManager status (1/2)
> TaskManager status (1/2)
> 2017-01-11 15:12:13,929 DEBUG org.apache.flink.yarn.YarnClusterClient - Looking up JobManager
> 2017-01-11 15:12:14,197 DEBUG org.apache.flink.yarn.YarnClusterClient - Looking up JobManager
> 2017-01-11 15:12:14,451 DEBUG org.apache.hadoop.ipc.Client - IPC Client (20529812) connection to ____/10.68.17.206:8032 from user sending #104
> 2017-01-11 15:12:14,452 DEBUG org.apache.hadoop.ipc.Client - IPC Client (20529812) connection to ___:8032 from user got value #104
> 2017-01-11 15:12:14,452 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine - Call: getApplicationReport took 1ms
> 2017-01-11 15:12:14,462 DEBUG org.apache.flink.yarn.YarnClusterClient - Looking up JobManager
> 2017-01-11 15:12:14,745 DEBUG org.apache.flink.yarn.YarnClusterClient - Looking up JobManager
> 2017-01-11 15:12:15,014 DEBUG org.apache.flink.yarn.YarnClusterClient - Looking up JobManager
> 2017-01-11 15:12:15,276 DEBUG org.apache.flink.yarn.YarnClusterClient - Looking up JobManager
> 2017-01-11 15:12:15,322 DEBUG org.apache.hadoop.ipc.Client - IPC Client (20529812) connection to ___:8020 from user: closed
> ...
>
> Any suggestions as to what could be causing this?
>
> Standard MapReduce jobs work without any problem on YARN.
>
> Best regards,
> Malte
>
