flink-user mailing list archives

From Eron Wright <eronwri...@gmail.com>
Subject Re: kerberos yarn - failure in long running streaming application
Date Mon, 14 Aug 2017 16:30:44 GMT
It sounds to me like the TGT is expiring (TGTs typically have a 12-hour
lifetime). This shouldn't happen in the keytab scenario, because Hadoop
provides a background thread that periodically performs a re-login using
the keytab. More details on the Hadoop internals here:
https://stackoverflow.com/a/34691071/3026310
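As a rough sketch of that background re-login pattern: Hadoop's
UserGroupInformation schedules a periodic task after a keytab login that
re-authenticates before the ticket expires. The sketch below is illustrative
only — a plain Runnable stands in for the actual Hadoop re-login call
(checkTGTAndReloginFromKeytab() in the real API), and the class name and
period are assumptions, not Hadoop internals.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class KeytabRelogin {

    // Schedule the re-login action to run repeatedly, well before the
    // TGT lifetime (typically 12 hours) elapses.
    public static ScheduledFuture<?> scheduleRelogin(
            ScheduledExecutorService pool, Runnable relogin,
            long periodMillis) {
        return pool.scheduleAtFixedRate(
                relogin, periodMillis, periodMillis, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws Exception {
        ScheduledExecutorService pool =
                Executors.newSingleThreadScheduledExecutor();
        AtomicInteger relogins = new AtomicInteger();
        // Stand-in for the real keytab re-login call; fires every 50 ms
        // here only so the sketch finishes quickly.
        scheduleRelogin(pool, relogins::incrementAndGet, 50);
        Thread.sleep(200);
        pool.shutdownNow();
        System.out.println("relogins >= 2: " + (relogins.get() >= 2));
    }
}
```

If that thread never starts (for example, because the process authenticated
from a ticket cache rather than a keytab), the TGT silently expires and you
see exactly this kind of GSSException hours later.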

To help narrow down the issue:
1. Please share the stack trace. (Also, does the error occur on the Job
Manager or on a Task Manager?)
2. Is kinit being called on the client prior to calling `flink run`? (Just
curious.)
3. Are you willing to share the Flink logs?

I'm happy to help if you prefer to share the logs privately.

-Eron

On Mon, Aug 14, 2017 at 12:32 AM, Ted Yu <yuzhihong@gmail.com> wrote:

> bq. security.kerberos.login.contexts: Client,KafkaClien
>
> Just curious: there is a missing 't' at the end of the above line.
>
> Maybe a typo when composing the email?
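(Assuming the missing 't' is only an email typo, the intended entry in
flink-conf.yaml would be:

```yaml
security.kerberos.login.contexts: Client,KafkaClient
```

where `Client` makes the Kerberos credentials available to the ZooKeeper
client and `KafkaClient` to the Kafka client. If the running cluster's
config really does say `KafkaClien`, the Kafka connection would not be
Kerberos-authenticated at all.)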
>
> On Sun, Aug 13, 2017 at 11:15 PM, Prabhu V <vprabhu@gmail.com> wrote:
>
>> Hi,
>>
>> I am running Flink-1.3.2 on YARN (Cloudera 2.6.0-cdh5.7.6). The
>> application streams data from Kafka, groups by key, creates a session
>> window, and writes to HDFS using a rich window function in the
>> "window.apply" method.
>>
>> The rich window function creates the sequence file as follows:
>>
>> SequenceFile.createWriter(
>>                 conf,
>>                 new Option[] {
>>                         Writer.file(new Path("flink-output/" + filePath)),
>>                         Writer.compression(CompressionType.BLOCK,
>>                                 new DefaultCodec()),
>>                         Writer.keyClass(BytesWritable.class),
>>                         Writer.valueClass(BytesWritable.class) })
>>
>> The "conf" is created in the "open" method as follows:
>>
conf = HadoopFileSystem.getHadoopConfiguration();
>> for (Map.Entry<String, String> entry : parameters.toMap().entrySet()) {
>>     conf.set(entry.getKey(), entry.getValue());
>> }
>>
>> where `parameters` is the flink.configuration.Configuration object passed
>> as an argument to the open method.
>>
>> The application runs for about 10 hours before it fails with the Kerberos
>> error "Caused by: javax.security.sasl.SaslException: GSS initiate failed
>> [Caused by GSSException: No valid credentials provided (Mechanism level:
>> Failed to find any Kerberos tgt)]"
>>
>> The flink-conf.yaml has the following properties set.
>> security.kerberos.login.keytab: <keytab location>
>> security.kerberos.login.principal: <principal>
>> security.kerberos.login.contexts: Client,KafkaClien
>>
>> Any help would be appreciated.
>>
>>
>> Thanks,
>> Prabhu
>>
>
>
