flink-user mailing list archives

From Maximilian Michels <...@apache.org>
Subject Re: flink on yarn - Fatal error in AM: The ContainerLaunchContext was not set
Date Tue, 23 Aug 2016 14:25:38 GMT
Hi Mira,

Does using the fully-qualified hostname solve the issue?
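
A minimal sketch of one way to spot such alias/FQDN mismatches, assuming you can list the hostnames that appear in your Hadoop/YARN configuration by hand. The function name and the example hostnames are hypothetical and not part of Flink or Hadoop; the only library call used is Python's standard socket.getfqdn.

```python
import socket


def find_alias_mismatches(hostnames, resolver=socket.getfqdn):
    """Return {configured_name: resolved_name} for every configured name
    that differs from the name the resolver reports for it.

    `resolver` defaults to socket.getfqdn; it is a parameter so the check
    can be exercised without touching DNS.
    """
    mismatches = {}
    for name in hostnames:
        resolved = resolver(name)
        # Compare case-insensitively and ignore a trailing root dot.
        if resolved.lower().rstrip(".") != name.lower().rstrip("."):
            mismatches[name] = resolved
    return mismatches


if __name__ == "__main__":
    # Hypothetical hostnames; replace with the ones from your configuration.
    configured = ["rm1", "nn1.example.com"]
    for alias, fqdn in find_alias_mismatches(configured).items():
        print(f"{alias} resolves to {fqdn}")
```

Any name this reports is a candidate for replacing with its fully-qualified form in the configuration; whether that fixes the error is something only a test run can show.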

Thanks,
Max

On Mon, Aug 22, 2016 at 1:38 PM, Miroslav Gajdoš
<miroslav.gajdos@firma.seznam.cz> wrote:
> Here is the log from the YARN application, run on another cluster (this
> time cdh5.7.0, but with a similar configuration). Check the hostnames: the
> configuration uses aliases, and judging by the log (exception at line 87),
> their difference from the FQDNs may be the cause...
>
> http://pastebin.com/iimPVbXB
>
> Thanks,
> Mira
>
>
>
> Maximilian Michels wrote on Fri, 19 Aug 2016 at 09:12 +0200:
>> Hi Mira,
>>
>> If I understood correctly, the log output should be for Flink 1.1.1.
>> However, there are classes present in the log which don't exist in
>> Flink 1.1.1, e.g. FlinkYarnClient. Could you please check if you
>> posted the correct log?
>>
>> Also, it would be good to have not only the client log but also the
>> log of the Flink Yarn application.
>>
>> Thanks,
>> Max
>>
>> On Thu, Aug 18, 2016 at 3:20 PM, Miroslav Gajdoš
>> <miroslav.gajdos@firma.seznam.cz> wrote:
>> >
>> > I tried building it from source as well as using the prebuilt binary
>> > release (v1.1.1); the latter produced this log output:
>> > http://pastebin.com/3L5Yhs9x
>> >
>> > Application in yarn still fails on "Fatal error in AM: The
>> > ContainerLaunchContext was not set".
>> >
>> > Mira
>> >
>> > Miroslav Gajdoš wrote on Thu, 18 Aug 2016 at 10:36 +0200:
>> > >
>> > > Hi Max,
>> > >
>> > > We build it from source and package it for Debian. I can try the
>> > > binary release for Hadoop 2.6.0.
>> > >
>> > > Regarding zookeeper, we do not share instances between dev and
>> > > production.
>> > >
>> > > Thanks,
>> > > Miroslav
>> > >
>> > > Maximilian Michels wrote on Thu, 18 Aug 2016 at 10:17 +0200:
>> > > >
>> > > >
>> > > > Hi Miroslav,
>> > > >
>> > > > From the logs it looks like you're using Flink version 1.0.x. The
>> > > > ContainerLaunchContext is always set by Flink. I'm wondering why this
>> > > > error can still occur. Are you using the default Hadoop version that
>> > > > comes with Flink (2.3.0)? You could try the Hadoop 2.6.0 build of
>> > > > Flink.
>> > > >
>> > > > Does your Dev cluster share the Zookeeper installation with the
>> > > > production cluster? I'm wondering because it receives incorrect
>> > > > leadership information although the leading JobManager seems to be
>> > > > attempting to register at the ApplicationMaster.
>> > > >
>> > > > Best,
>> > > > Max
>> > > >
>> > > > On Tue, Aug 16, 2016 at 1:28 PM, Miroslav Gajdoš
>> > > > <miroslav.gajdos@firma.seznam.cz> wrote:
>> > > > >
>> > > > >
>> > > > >
>> > > > > Log from yarn session runner is here:
>> > > > > http://pastebin.com/xW1W4HNP
>> > > > >
>> > > > > Our Hadoop distribution is from Cloudera; the ResourceManager
>> > > > > version is 2.6.0-cdh5.4.5. It runs in HA mode, so requests to the
>> > > > > ResourceManager and/or NameNode may be redirected to the active one.
>> > > > >
>> > > > > Ufuk Celebi wrote on Tue, 16 Aug 2016 at 12:18 +0200:
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > > This could be a bug in Flink. Can you share the complete logs of
>> > > > > > the run? CC'ing Max, who worked on the YARN client recently and
>> > > > > > might have an idea in which cases Flink would not set the context.
>> > > > > >
>> > > > > > On Tue, Aug 16, 2016 at 11:00 AM, Miroslav Gajdoš
>> > > > > > <miroslav.gajdos@firma.seznam.cz> wrote:
>> > > > > > >
>> > > > > > >
>> > > > > > >
>> > > > > > >
>> > > > > > > Hi guys,
>> > > > > > >
>> > > > > > > I've run into some problems with Flink on YARN. I am trying to
>> > > > > > > deploy Flink to our cluster using
>> > > > > > > /usr/lib/flink-scala2.10/bin/yarn-session.sh, but the YARN
>> > > > > > > application does not even start; it goes from accepted to
>> > > > > > > finished/failed. The YARN info on the ResourceManager looks like
>> > > > > > > this:
>> > > > > > >
>> > > > > > > User:             wa-flink
>> > > > > > > Name:             Flink session with 3 TaskManagers
>> > > > > > > Application Type: Apache Flink
>> > > > > > > Application Tags:
>> > > > > > > State:            FINISHED
>> > > > > > > FinalStatus:      FAILED
>> > > > > > > Started:          Mon Aug 15 18:02:42 +0200 2016
>> > > > > > > Elapsed:          16sec
>> > > > > > > Tracking URL:     History
>> > > > > > > Diagnostics:      Fatal error in AM: The ContainerLaunchContext
>> > > > > > >                   was not set.
>> > > > > > >
>> > > > > > > On the dev cluster, applications deploy without problems; this
>> > > > > > > happens only in production.
>> > > > > > >
>> > > > > > > What could be wrong?
>> > > > > > >
>> > > > > > >
>> > > > > > > Thanks,
>> > > > > > >
>> > > > > > > --
>> > > > > > > Miroslav Gajdoš
>> > > > > > > development - web analytics (Brno)
>> > > > > > > https://reporter.seznam.cz
>> > > > > > > miroslav.gajdos@firma.seznam.cz
>> > > > > > >
>> > > > > > >
