flink-user mailing list archives

From Prabhu V <vpra...@gmail.com>
Subject Re: [EXTERNAL] Re: Fink application failing with kerberos issue after running successfully without any issues for few days
Date Thu, 17 Aug 2017 22:13:49 GMT
+1 on the 7 day expiry explanation,

This is most likely the cause.

I faced the 7 day expiry issue with a previous version of Flink that didn't
support keytabs. I am currently running Flink 1.3 with keytabs (it has been
going okay for 2 days now); I will update after the 7 day mark.

Thanks,
Prabhu
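
For reference, a keytab-based login of the kind mentioned above is normally
set up in flink-conf.yaml. The fragment below is only a minimal sketch: the
keytab path and the principal are hypothetical placeholders, and the exact
set of keys should be checked against the security documentation for the
Flink version in use.

```yaml
# flink-conf.yaml (sketch; path and principal below are placeholders, not real values)
security.kerberos.login.keytab: /path/to/flink.keytab
security.kerberos.login.principal: flink-user@EXAMPLE.COM
```

With a keytab, the login can be refreshed for as long as the job runs, so the
job does not depend on a delegation token that expires after its maximum
lifetime.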

On Thu, Aug 17, 2017 at 11:06 AM, Eron Wright <eronwright@gmail.com> wrote:

> Raja,
> According to those configuration values, the delegation token would be
> automatically renewed every 24 hours, then expire entirely after 7 days.
> You say that the job ran without issue for 'a few days'.  Can we conclude
> that the job hit the 7-day DT expiration?
>
> Flink supports the use of Kerberos keytabs as an alternative to delegation
> tokens for exactly this reason, that delegation tokens eventually expire
> and so aren't useful to a long-running program.   Consider making use of
> keytabs here.
>
> Hope this helps!
> -Eron
>
>
> On Thu, Aug 17, 2017 at 9:58 AM, Ted Yu <yuzhihong@gmail.com> wrote:
>
>> I think this needs to be done by the admin.
>>
>> On Thu, Aug 17, 2017 at 9:37 AM, Raja.Aravapalli <
>> Raja.Aravapalli@target.com> wrote:
>>
>>>
>>>
>>> I don’t have access to the site.xml files; they are controlled by a
>>> support team.
>>>
>>>
>>>
>>> Does Flink have any configuration settings or APIs through which we can
>>> control this?
>>>
>>>
>>>
>>>
>>>
>>> Regards,
>>>
>>> Raja.
>>>
>>>
>>>
>>> *From: *Ted Yu <yuzhihong@gmail.com>
>>> *Date: *Thursday, August 17, 2017 at 11:07 AM
>>> *To: *Raja Aravapalli <Raja.Aravapalli@target.com>
>>> *Cc: *"user@flink.apache.org" <user@flink.apache.org>
>>> *Subject: *Re: [EXTERNAL] Re: Fink application failing with kerberos
>>> issue after running successfully without any issues for few days
>>>
>>>
>>>
>>> Can you try shortening the renewal interval to something like 28800000
>>> (8 hours, in milliseconds)?
>>>
>>>
>>>
>>> Cheers
>>>
>>>
>>>
>>> On Thu, Aug 17, 2017 at 8:58 AM, Raja.Aravapalli <
>>> Raja.Aravapalli@target.com> wrote:
>>>
>>> Hi Ted,
>>>
>>>
>>>
>>> Below is what I see in the environment:
>>>
>>>
>>>
>>> dfs.namenode.delegation.token.max-lifetime:          *604800000*
>>>
>>> dfs.namenode.delegation.token.renew-interval:      *86400000*
>>>
>>>
>>>
>>>
>>>
>>> Thanks.
>>>
>>>
>>>
>>>
>>>
>>> Regards,
>>>
>>> Raja.
>>>
>>>
>>>
>>> *From: *Ted Yu <yuzhihong@gmail.com>
>>> *Date: *Thursday, August 17, 2017 at 10:46 AM
>>> *To: *Raja Aravapalli <Raja.Aravapalli@target.com>
>>> *Cc: *"user@flink.apache.org" <user@flink.apache.org>
>>> *Subject: *[EXTERNAL] Re: Fink application failing with kerberos issue
>>> after running successfully without any issues for few days
>>>
>>>
>>>
>>> What are the values for the following parameters ?
>>>
>>>
>>>
>>> dfs.namenode.delegation.token.max-lifetime
>>>
>>>
>>>
>>> dfs.namenode.delegation.token.renew-interval
>>>
>>>
>>>
>>> Cheers
>>>
>>>
>>>
>>> On Thu, Aug 17, 2017 at 8:24 AM, Raja.Aravapalli <
>>> Raja.Aravapalli@target.com> wrote:
>>>
>>> Hi Ted,
>>>
>>>
>>>
>>> Find below the configuration I see in yarn-site.xml
>>>
>>>
>>>
>>> <property>
>>>   <name>yarn.resourcemanager.proxy-user-privileges.enabled</name>
>>>   <value>true</value>
>>> </property>
>>>
>>>
>>>
>>>
>>>
>>> Regards,
>>>
>>> Raja.
>>>
>>>
>>>
>>>
>>>
>>> *From: *Ted Yu <yuzhihong@gmail.com>
>>> *Date: *Wednesday, August 16, 2017 at 9:05 PM
>>> *To: *Raja Aravapalli <Raja.Aravapalli@target.com>
>>> *Cc: *"user@flink.apache.org" <user@flink.apache.org>
>>> *Subject: *[EXTERNAL] Re: hadoop
>>>
>>>
>>>
>>> Can you check the following config in yarn-site.xml ?
>>>
>>>
>>>
>>> yarn.resourcemanager.proxy-user-privileges.enabled (true)
>>>
>>>
>>>
>>> Cheers
>>>
>>>
>>>
>>> On Wed, Aug 16, 2017 at 4:48 PM, Raja.Aravapalli <
>>> Raja.Aravapalli@target.com> wrote:
>>>
>>>
>>>
>>> Hi,
>>>
>>>
>>>
>>> I started a Flink yarn-session on a running Hadoop cluster and am
>>> running a streaming application on it.
>>>
>>>
>>>
>>> But after a few days of running without any issues, the Flink
>>> application, which writes data to HDFS, fails with the exception below.
>>>
>>>
>>>
>>> Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
>>> token (HDFS_DELEGATION_TOKEN token xxxxxx for xxxxxx) can't be found in cache
>>>
>>>
>>>
>>>
>>>
>>> Can someone please help me fix this? Thanks a lot.
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> Regards,
>>>
>>> Raja.
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>
>>
>
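
The millisecond values quoted in the thread above are easier to reason about
as days and hours. The snippet below is a small illustrative sketch, not part
of Flink or Hadoop; the class and method names are made up for this example.
It converts the reported max-lifetime and renew-interval, plus the shorter
interval that was suggested, into human-readable units.

```java
import java.time.Duration;

// Hypothetical helper class, written only to illustrate the arithmetic in this thread.
public class DelegationTokenTimings {
    // Values reported in the thread, in milliseconds.
    static final long MAX_LIFETIME_MS = 604_800_000L;    // dfs.namenode.delegation.token.max-lifetime
    static final long RENEW_INTERVAL_MS = 86_400_000L;   // dfs.namenode.delegation.token.renew-interval
    static final long SUGGESTED_RENEW_MS = 28_800_000L;  // shorter renew interval suggested in the thread

    // Whole days contained in the given number of milliseconds.
    static long toDays(long millis) {
        return Duration.ofMillis(millis).toDays();
    }

    // Whole hours contained in the given number of milliseconds.
    static long toHours(long millis) {
        return Duration.ofMillis(millis).toHours();
    }

    public static void main(String[] args) {
        System.out.println("max-lifetime:             " + toDays(MAX_LIFETIME_MS) + " days");
        System.out.println("renew-interval:           " + toHours(RENEW_INTERVAL_MS) + " hours");
        System.out.println("suggested renew-interval: " + toHours(SUGGESTED_RENEW_MS) + " hours");
    }
}
```

This gives 7 days, 24 hours, and 8 hours respectively, which matches the
explanation in the thread: the token is renewed every 24 hours and expires
entirely after 7 days. As far as I understand, a shorter renew interval only
changes how often the token must be renewed and does not extend the 7-day
maximum lifetime.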
