flink-user mailing list archives

From Konstantin Knauf <konstantin.kn...@tngtech.com>
Subject Re: YARN High Availability
Date Tue, 05 Apr 2016 07:18:47 GMT
Hi Ufuk, Hi Stephan,

sorry for the late response. Attached are the client logs.

Cheers,

Konstantin

On 04.04.2016 21:20, Stephan Ewen wrote:
> This seems to be the critical part in the logs:
> 
> 2016-03-31 09:01:52,234 INFO  org.apache.flink.yarn.YarnJobManager      
>                    - Re-submitting 0 job graphs.
> 2016-03-31 09:02:51,182 INFO  org.apache.flink.yarn.YarnJobManager      
>                    - Stopping YARN JobManager with status FAILED and
> diagnostic Flink YARN Client requested shutdown.
> 
> The YarnJobManager starts up properly, but the Client never sends
> anything, shuts down at some point, and tears down the YARN cluster.
> 
> Client logs would help a lot there...
> 
> 
> 
> 
> On Sat, Apr 2, 2016 at 12:43 PM, Ufuk Celebi <uce@apache.org> wrote:
> 
>     Hey Konstantin,
> 
>     That's weird. Can you please log the client output on DEBUG level and
>     provide that as well? I'm wondering whether the client uses a
>     different root path.
> 
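>     A minimal sketch, assuming the client picks up conf/log4j-cli.properties
>     and its default "file" appender; setting the root logger there to DEBUG
>     should be enough:
>
>     # conf/log4j-cli.properties
>     log4j.rootLogger=DEBUG, file
>
>     Then re-run the YARN session and attach the resulting client log.
>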
>     The following seems to happen:
>     - you use ledf_recovery as the root namespace
>     - the task managers are connecting (they resolve the JM address via
>     ZooKeeper in this case as well, which means they correctly use the
>     same namespace)
>     - but the client, which started the YARN session, never submits the
>     job to the cluster.
> 
>     – Ufuk
> 
>     On Thu, Mar 31, 2016 at 9:23 AM, Konstantin Knauf
>     <konstantin.knauf@tngtech.com> wrote:
>     > Hi everyone,
>     >
>     > we are running into some problems with multiple per-job YARN
>     > sessions, too.
>     >
>     > When we are starting a per-job YARN session (Flink 1.0, Hadoop 2.4)
>     > with recovery.zookeeper.path.root other than /flink, the YARN session
>     > starts but no job is submitted, and after 1 min or so the session
>     > crashes. I attached the jobmanager log.
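>     >
>     > For reference, a rough sketch of our setup (the jar path is a
>     > placeholder and the exact run options may differ):
>     >
>     > # flink-conf.yaml
>     > recovery.zookeeper.path.root: /ledf_recovery
>     >
>     > # start the per-job YARN session
>     > bin/flink run -m yarn-cluster -yn 2 path/to/job.jar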
>     >
>     > In ZooKeeper the root directory is created with the child nodes
>     >
>     > leaderlatch
>     > jobgraphs
>     >
>     > /flink also exists, but does not have any child nodes.
>     >
>     > Everything runs fine with the default recovery.zookeeper.path.root.
>     >
>     > Does anyone have an idea what is going on?
>     >
>     > Cheers,
>     >
>     > Konstantin
>     >
>     >
>     > On 23.11.2015 17:00, Gwenhael Pasquiers wrote:
>     >> We are not yet using HA in our cluster instances.
>     >>
>     >> But yes, we will have to change the zookeeper.path.root :-)
>     >>
>     >>
>     >>
>     >> We package our jobs with their own config folder (we don’t rely on
>     >> flink’s config folder); we can put the maven project name into this
>     >> property, so they will have different values :-)
>     >>
>     >>
>     >>
>     >>
>     >>
>     >> *From:* Till Rohrmann [mailto:trohrmann@apache.org]
>     >> *Sent:* Monday, 23 November 2015 14:51
>     >> *To:* user@flink.apache.org
>     >> *Subject:* Re: YARN High Availability
>     >>
>     >>
>     >>
>     >> The problem is the execution graph handle which is stored in
>     >> ZooKeeper. You can manually remove it via the ZooKeeper shell by
>     >> simply deleting everything below your `recovery.zookeeper.path.root`
>     >> ZNode. But you should be sure that the cluster has been stopped
>     >> before.
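>     >>
>     >> A minimal sketch, assuming the default /flink root and the zkCli.sh
>     >> client that ships with ZooKeeper (adjust host, port and path to your
>     >> setup):
>     >>
>     >> # connect to the quorum and inspect the Flink root ZNode
>     >> bin/zkCli.sh -server host1:3181
>     >> ls /flink
>     >> # recursively delete the ZNode and everything below it (3.4.x syntax)
>     >> rmr /flink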
>     >>
>     >>
>     >>
>     >> Do you start the different clusters with different
>     >> `recovery.zookeeper.path.root` values? If not, then you will run into
>     >> trouble when running multiple clusters at the same time. The reason
>     >> is that all clusters will then think that they belong together.
>     >>
>     >>
>     >>
>     >> Cheers,
>     >>
>     >> Till
>     >>
>     >>
>     >>
>     >> On Mon, Nov 23, 2015 at 2:15 PM, Gwenhael Pasquiers
>     >> <gwenhael.pasquiers@ericsson.com> wrote:
>     >>
>     >> OK, I understand.
>     >>
>     >> Maybe we are not really using Flink as you intended. The way we are
>     >> using it, one cluster equals one job. That way we are sure to isolate
>     >> the different jobs as much as possible, and in case of crashes / bugs
>     >> / etc. we can completely kill one cluster without interfering with
>     >> the other jobs.
>     >>
>     >> That future behavior seems good :-)
>     >>
>     >> Instead of the manual flink commands, is there a way to manually
>     >> delete those old jobs before launching my job? They are probably
>     >> somewhere in HDFS, aren't they?
>     >>
>     >> B.R.
>     >>
>     >>
>     >>
>     >> -----Original Message-----
>     >> From: Ufuk Celebi [mailto:uce@apache.org]
>     >> Sent: Monday, 23 November 2015 12:12
>     >> To: user@flink.apache.org
>     >> Subject: Re: YARN High Availability
>     >>
>     >> Hey Gwenhaël,
>     >>
>     >> the restarting jobs are most likely old job submissions. They are not
>     >> cleaned up when you shut down the cluster, but only when they finish
>     >> (either by finishing regularly or after being cancelled).
>     >>
>     >> The workaround is to use the command line frontend:
>     >>
>     >> bin/flink cancel JOBID
>     >>
>     >> for each RESTARTING job. Sorry about the inconvenience!
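>     >>
>     >> A quick sketch, assuming the CLI picks up the same HA configuration as
>     >> the running cluster:
>     >>
>     >> # list the jobs the JobManager knows about, together with their IDs
>     >> bin/flink list
>     >> # cancel each stale job by its ID
>     >> bin/flink cancel <JOBID>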
>     >>
>     >> We are in an active discussion about addressing this. The future
>     >> behaviour will be that the startup or shutdown of a cluster cleans up
>     >> everything, with an option to skip this step.
>     >>
>     >> The reasoning for the initial solution (not removing anything) was to
>     >> make sure that no jobs are deleted by accident. But it looks like
>     >> this is more confusing than helpful.
>     >>
>     >> – Ufuk
>     >>
>     >>> On 23 Nov 2015, at 11:45, Gwenhael Pasquiers
>     >>> <gwenhael.pasquiers@ericsson.com> wrote:
>     >>>
>     >>> Hi again !
>     >>>
>     >>> On the same topic I'm still trying to start my streaming job
>     >>> with HA.
>     >>> The HA part seems to be more or less OK (I killed the JobManager and
>     >> it came back), however I have an issue with the TaskManagers.
>     >>> I configured my job to have only one TaskManager and 1 slot that
>     >>> does [source=>map=>sink].
>     >>> The issue I'm encountering is that other instances of my job appear
>     >> and are in the RESTARTING status since there is only one task slot.
>     >>>
>     >>> Do you know about this, or have an idea of where to look in order
>     >>> to understand what's happening?
>     >>>
>     >>> B.R.
>     >>>
>     >>> Gwenhaël PASQUIERS
>     >>>
>     >>> -----Original Message-----
>     >>> From: Maximilian Michels [mailto:mxm@apache.org]
>     >>> Sent: Thursday, 19 November 2015 13:36
>     >>> To: user@flink.apache.org
>     >>> Subject: Re: YARN High Availability
>     >>>
>     >>> The docs have been updated.
>     >>>
>     >>> On Thu, Nov 19, 2015 at 12:36 PM, Ufuk Celebi <uce@apache.org> wrote:
>     >>>> I’ve added a note about this to the docs and asked Max to trigger a
>     >>>> new build of them.
>     >>>>
>     >>>> Regarding Aljoscha’s idea: I like it. It is essentially a shortcut
>     >> for configuring the root path.
>     >>>>
>     >>>> In any case, it is orthogonal to Till’s proposals. That one we need
>     >>>> to address as well (see FLINK-2929). The motivation for the current
>     >>>> behaviour was to be rather defensive when removing state in order
>     >>>> not to lose data accidentally. But it can be confusing, indeed.
>     >>>>
>     >>>> – Ufuk
>     >>>>
>     >>>>> On 19 Nov 2015, at 12:08, Till Rohrmann <trohrmann@apache.org> wrote:
>     >>>>>
>     >>>>> You mean an additional start-up parameter for the `start-cluster.sh`
>     >>>>> script for the HA case? That could work.
>     >>>>>
>     >>>>> On Thu, Nov 19, 2015 at 11:54 AM, Aljoscha Krettek
>     >>>>> <aljoscha@apache.org> wrote:
>     >>>>> Maybe we could add a user parameter to specify a cluster name that
>     >>>>> is used to make the paths unique.
>     >>>>>
>     >>>>>
>     >>>>> On Thu, Nov 19, 2015, 11:24 Till Rohrmann
>     >>>>> <trohrmann@apache.org> wrote:
>     >>>>> I agree that this would make the configuration easier. However, it
>     >>>>> also entails that the user has to retrieve the randomized path from
>     >>>>> the logs if he wants to restart jobs after the cluster has crashed
>     >>>>> or was intentionally restarted. Furthermore, the system won't be
>     >>>>> able to clean up old checkpoint and job handles in case the cluster
>     >>>>> stop was intentional.
>     >>>>>
>     >>>>> Thus, the question is how do we define the behaviour in order to
>     >>>>> retrieve handles and to clean up old handles so that ZooKeeper
>     >>>>> won't be cluttered with old handles?
>     >>>>>
>     >>>>> There are basically two modes:
>     >>>>>
>     >>>>> 1. Keep state handles when shutting down the cluster. Provide a
>     >>>>> means to define a fixed path when starting the cluster and also a
>     >>>>> means to purge old state handles. Furthermore, add a shutdown mode
>     >>>>> where the handles under the current path are directly removed. This
>     >>>>> mode would guarantee that the state handles are always available
>     >>>>> unless explicitly told otherwise. However, the downside is that
>     >>>>> ZooKeeper will most certainly be cluttered.
>     >>>>>
>     >>>>> 2. Remove the state handles when shutting down the cluster. Provide
>     >>>>> a shutdown mode where we keep the state handles. This will keep
>     >>>>> ZooKeeper clean but will also give you the possibility to keep a
>     >>>>> checkpoint around if necessary. However, the user is more likely to
>     >>>>> lose his state when shutting down the cluster.
>     >>>>>
>     >>>>> On Thu, Nov 19, 2015 at 10:55 AM, Robert Metzger
>     >>>>> <rmetzger@apache.org> wrote:
>     >>>>> I agree with Aljoscha. Many companies install Flink (and its config)
>     >>>>> in a central directory and users share that installation.
>     >>>>>
>     >>>>> On Thu, Nov 19, 2015 at 10:45 AM, Aljoscha Krettek
>     >>>>> <aljoscha@apache.org> wrote:
>     >>>>> I think we should find a way to randomize the paths where the HA
>     >>>>> stuff stores data. If users don’t realize that they store data in
>     >>>>> the same paths this could lead to problems.
>     >>>>>
>     >>>>>> On 19 Nov 2015, at 08:50, Till Rohrmann <trohrmann@apache.org> wrote:
>     >>>>>>
>     >>>>>> Hi Gwenhaël,
>     >>>>>>
>     >>>>>> good to hear that you could resolve the problem.
>     >>>>>>
>     >>>>>> When you run multiple HA Flink jobs in the same cluster, then you
>     >>>>>> don’t have to adjust the configuration of Flink. It should work
>     >>>>>> out of the box.
>     >>>>>>
>     >>>>>> However, if you run multiple HA Flink clusters, then you have to
>     >>>>>> set a distinct ZooKeeper root path for each cluster via the option
>     >>>>>> recovery.zookeeper.path.root in the Flink configuration. This is
>     >>>>>> necessary because otherwise all JobManagers (the ones of the
>     >>>>>> different clusters) will compete for a single leadership.
>     >>>>>> Furthermore, all TaskManagers will only see the one and only leader
>     >>>>>> and connect to it. The reason is that the TaskManagers look up
>     >>>>>> their leader at a ZNode below the ZooKeeper root path.
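>     >>>>>>
>     >>>>>> A minimal sketch (the path names are just examples), one entry in
>     >>>>>> the flink-conf.yaml of each cluster:
>     >>>>>>
>     >>>>>> # cluster A
>     >>>>>> recovery.zookeeper.path.root: /flink_cluster_a
>     >>>>>> # cluster B
>     >>>>>> recovery.zookeeper.path.root: /flink_cluster_b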
>     >>>>>>
>     >>>>>> If you have other questions then don’t hesitate to ask me.
>     >>>>>>
>     >>>>>> Cheers,
>     >>>>>> Till
>     >>>>>>
>     >>>>>>
>     >>>>>> On Wed, Nov 18, 2015 at 6:37 PM, Gwenhael Pasquiers
>     >>>>>> <gwenhael.pasquiers@ericsson.com> wrote:
>     >>>>>> Nevermind,
>     >>>>>>
>     >>>>>>
>     >>>>>>
>     >>>>>> Looking at the logs I saw that it was having issues trying to
>     >>>>>> connect to ZK.
>     >>>>>>
>     >>>>>> To make it short, it had the wrong port.
>     >>>>>>
>     >>>>>>
>     >>>>>>
>     >>>>>> It is now starting.
>     >>>>>>
>     >>>>>>
>     >>>>>>
>     >>>>>> Tomorrow I’ll try to kill some JobManagers *evil*.
>     >>>>>>
>     >>>>>>
>     >>>>>>
>     >>>>>> Another question: if I have multiple HA Flink jobs, are there some
>     >>>>>> points to check in order to be sure that they won’t collide on
>     >>>>>> HDFS or ZK?
>     >>>>>>
>     >>>>>>
>     >>>>>>
>     >>>>>> B.R.
>     >>>>>>
>     >>>>>>
>     >>>>>>
>     >>>>>> Gwenhaël PASQUIERS
>     >>>>>>
>     >>>>>>
>     >>>>>>
>     >>>>>> From: Till Rohrmann [mailto:till.rohrmann@gmail.com]
>     >>>>>> Sent: Wednesday, 18 November 2015 18:01
>     >>>>>> To: user@flink.apache.org
>     >>>>>> Subject: Re: YARN High Availability
>     >>>>>>
>     >>>>>>
>     >>>>>>
>     >>>>>> Hi Gwenhaël,
>     >>>>>>
>     >>>>>>
>     >>>>>>
>     >>>>>> do you have access to the yarn logs?
>     >>>>>>
>     >>>>>>
>     >>>>>>
>     >>>>>> Cheers,
>     >>>>>>
>     >>>>>> Till
>     >>>>>>
>     >>>>>>
>     >>>>>>
>     >>>>>> On Wed, Nov 18, 2015 at 5:55 PM, Gwenhael Pasquiers
>     >>>>>> <gwenhael.pasquiers@ericsson.com> wrote:
>     >>>>>>
>     >>>>>> Hello,
>     >>>>>>
>     >>>>>>
>     >>>>>>
>     >>>>>> We’re trying to set up high availability using an existing
>     >>>>>> ZooKeeper quorum already running in our Cloudera cluster.
>     >>>>>>
>     >>>>>>
>     >>>>>>
>     >>>>>> So, as per the docs we’ve changed the max attempts in YARN’s
>     >>>>>> config as well as in flink-conf.yaml.
>     >>>>>>
>     >>>>>>
>     >>>>>>
>     >>>>>> recovery.mode: zookeeper
>     >>>>>>
>     >>>>>> recovery.zookeeper.quorum: host1:3181,host2:3181,host3:3181
>     >>>>>>
>     >>>>>> state.backend: filesystem
>     >>>>>>
>     >>>>>> state.backend.fs.checkpointdir: hdfs:///flink/checkpoints
>     >>>>>>
>     >>>>>> recovery.zookeeper.storageDir: hdfs:///flink/recovery/
>     >>>>>>
>     >>>>>> yarn.application-attempts: 1000
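>     >>>>>>
>     >>>>>> (For reference, the YARN-side setting is, if I remember correctly,
>     >>>>>> yarn.resourcemanager.am.max-attempts in yarn-site.xml; the Flink
>     >>>>>> value above cannot exceed it:
>     >>>>>>
>     >>>>>> <property>
>     >>>>>>   <name>yarn.resourcemanager.am.max-attempts</name>
>     >>>>>>   <value>1000</value>
>     >>>>>> </property>
>     >>>>>> )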
>     >>>>>>
>     >>>>>>
>     >>>>>>
>     >>>>>> Everything is OK as long as recovery.mode is commented out.
>     >>>>>>
>     >>>>>> As soon as I uncomment recovery.mode the deployment on YARN is
>     >>>>>> stuck on:
>     >>>>>>
>     >>>>>>
>     >>>>>>
>     >>>>>> “Deploying cluster, current state ACCEPTED”.
>     >>>>>>
>     >>>>>> “Deployment took more than 60 seconds….”
>     >>>>>>
>     >>>>>> Every second.
>     >>>>>>
>     >>>>>>
>     >>>>>>
>     >>>>>> And I have more than enough resources available on my YARN
>     >>>>>> cluster.
>     >>>>>>
>     >>>>>>
>     >>>>>>
>     >>>>>> Do you have any idea of what could cause this, and/or what logs I
>     >>>>>> should look for in order to understand?
>     >>>>>>
>     >>>>>>
>     >>>>>>
>     >>>>>> B.R.
>     >>>>>>
>     >>>>>>
>     >>>>>>
>     >>>>>> Gwenhaël PASQUIERS
>     >>>>>>
>     >>>>>>
>     >>>>>>
>     >>>>>>
>     >>>>>
>     >>>>>
>     >>>>>
>     >>>>>
>     >>>>
>     >>> <unwanted_jobs.jpg>
>     >>
>     >>
>     >>
>     >
>     > --
>     > Konstantin Knauf * konstantin.knauf@tngtech.com * +49-174-3413182
>     > TNG Technology Consulting GmbH, Betastr. 13a, 85774 Unterföhring
>     > Geschäftsführer: Henrik Klagges, Christoph Stock, Dr. Robert Dahlke
>     > Sitz: Unterföhring * Amtsgericht München * HRB 135082
> 
> 

-- 
Konstantin Knauf * konstantin.knauf@tngtech.com * +49-174-3413182
TNG Technology Consulting GmbH, Betastr. 13a, 85774 Unterföhring
Geschäftsführer: Henrik Klagges, Christoph Stock, Dr. Robert Dahlke
Sitz: Unterföhring * Amtsgericht München * HRB 135082
