From: Maximilian Michels
Date: Sat, 6 Feb 2016 14:05:00 +0100
Subject: Re: Flink on YARN: Stuck on "Trying to register at JobManager"
To: "user@flink.apache.org"

Hi Pieter,

Which version of Flink are you using? It appears you've created a Flink YARN cluster but you can't reach the JobManager afterwards.

Cheers,
Max

On Sat, Feb 6, 2016 at 1:42 PM, Pieter Hameete wrote:
> Hi Robert,
>
> unfortunately there are no signs of what is going wrong in the logs. The
> last log messages are about successful registration of the TaskManagers.
>
> I'm also fairly sure it must be something in my VM that is causing this,
> because when I start the yarn-session from a login node that is on the same
> network as the hadoop cluster there are no problems registering with the
> JobManager. I did also notice the following message in the local console:
>
> 12:30:27,173 WARN Remoting
> - Tried to associate with unreachable remote address
> [akka.tcp://flink@145.100.41.13:41539]. Address is now gated for 5000 ms,
> all messages to this address will be delivered to dead letters. Reason:
> connection timed out: /145.100.41.13:41539
>
> I can ping the JobManager fine from within the VM. Could there be some
> invalid or missing configuration on my side?
>
> Cheers,
>
> Pieter
>
>
> 2016-02-06 12:54 GMT+01:00 Robert Metzger:
>>
>> Hi,
>>
>> did you check the logs of the JobManager itself? Maybe it'll tell us
>> already what's going on.
>>
>> On Sat, Feb 6, 2016 at 12:14 PM, Pieter Hameete
>> wrote:
>>>
>>> Hi Guys!
>>>
>>> I'm attempting to run Flink on YARN, but I run into an issue. I'm starting
>>> the Flink YARN session from an Ubuntu 14.04 VM. All goes well until after
>>> the JobManager web UI is started:
>>>
>>> JobManager web interface address
>>> http://head05.hathi.surfsara.nl:8088/proxy/application_1452780322684_10532/
>>> Waiting until all TaskManagers have connected
>>> 11:09:51,557 INFO org.apache.flink.yarn.ApplicationClient
>>> - Notification about new leader address
>>> akka.tcp://flink@145.100.41.148:35666/user/jobmanager with session ID null.
>>> No status updates from the YARN cluster received so far. Waiting ...
>>> 11:09:51,578 INFO org.apache.flink.yarn.ApplicationClient
>>> - Received address of new leader
>>> akka.tcp://flink@145.100.41.148:35666/user/jobmanager with session ID null.
>>> 11:09:51,583 INFO org.apache.flink.yarn.ApplicationClient
>>> - Disconnect from JobManager null.
>>> 11:09:51,595 INFO org.apache.flink.yarn.ApplicationClient
>>> - Trying to register at JobManager
>>> akka.tcp://flink@145.100.41.148:35666/user/jobmanager.
>>> No status updates from the YARN cluster received so far. Waiting ...
>>> No status updates from the YARN cluster received so far. Waiting ...
>>>
>>> It then hangs on these last steps (trying to register, no status
>>> updates...).
>>>
>>> I'm sure there must be a problem on my side that is causing me not to be
>>> able to register at the JobManager. What could cause such connection
>>> problems?
>>>
>>> Any tips are very welcome :-)
>>>
>>> Cheers and have a good weekend!
>>>
>>> - Pieter
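The WARN message quoted above reports a TCP connection timeout to the JobManager's Akka port, while ping from the VM succeeds. Ping only exercises ICMP, so it does not show whether that TCP port is reachable from the VM. A minimal sketch of a direct check follows, not a confirmed diagnosis; the helper name `can_connect` is hypothetical, and the host and port in the usage comment are copied from the quoted log and would need to be replaced with the values from your own session output.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and unreachable networks
        return False


# Example usage (address taken from the quoted WARN message; an assumption
# that the same port is still in use):
#   can_connect("145.100.41.13", 41539)
```

If this returns False from the VM but True from the login node on the cluster network, that would point to a firewall or routing restriction between the VM and the cluster rather than to Flink itself.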