hadoop-common-issues mailing list archives

From "Xiao Chen (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HADOOP-13590) Retry until TGT expires even if the UGI renewal thread encountered exception
Date Tue, 01 Nov 2016 05:43:58 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-13590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiao Chen updated HADOOP-13590:
-------------------------------
    Attachment: HADOOP-13590.08.patch

Thanks [~andrew.wang] for looking at this! Patch 8 should address all the comments:

bq. unit test
Now TestUGI tests just the retry logic, and the new {{TestUGIWithMiniKdc}} tests that it 'retries
at all'.

bq. Exponential back-off
Combining your comment with [~stevel@apache.org]'s comment about using {{RetryPolicy}}, I changed
the retry-time calculation. It now first calculates the maximum number of retries that could
possibly be needed, then creates an {{ExponentialBackoffRetry}} object and delegates the calculation
to it. This way we get both the random interval and code reuse. The UGI code is (I think) harder
to read, though.
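
For readers of the archive, the two-step calculation described above could be sketched roughly as follows. This is not the code from the patch: the class name {{RenewalBackoffSketch}}, both method names, the millisecond units, and the jitter range of [0.5, 1.5) are all assumptions made for illustration, and the sketch does not use Hadoop's {{RetryPolicy}} classes at all. The retry count is estimated from the nominal (un-jittered) doubling sequence only.

```java
import java.util.concurrent.ThreadLocalRandom;

// Illustrative sketch only; names and constants are hypothetical, not from HADOOP-13590.
class RenewalBackoffSketch {

    /**
     * Estimates how many retries fit before the ticket expires, assuming the
     * nominal sleep doubles each time (base, 2*base, 4*base, ...).
     * Inputs are assumed to be small enough that the sums do not overflow a long.
     */
    static int maxRetries(long baseSleepMs, long windowMs) {
        int retries = 0;
        long elapsed = 0;
        long sleep = baseSleepMs;
        while (sleep > 0 && elapsed + sleep <= windowMs) {
            elapsed += sleep;
            sleep *= 2;
            retries++;
        }
        return retries;
    }

    /**
     * Sleep time before the given retry (0-based): the doubled base, capped at
     * capMs, then scaled by a random factor in [0.5, 1.5) so that clients do
     * not all retry at the same instant.
     */
    static long sleepMs(long baseSleepMs, int retry, long capMs) {
        long exp = baseSleepMs;
        for (int i = 0; i < retry && exp < capMs; i++) {
            exp *= 2;
        }
        exp = Math.min(exp, capMs);
        double jitter = 0.5 + ThreadLocalRandom.current().nextDouble();
        return (long) (exp * jitter);
    }
}
```

A caller would compute {{maxRetries(base, timeUntilExpiry)}} once after the first failure, then sleep for {{sleepMs(base, attempt, cap)}} before each further attempt and give up when the attempts run out or the TGT end time is reached.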

> Retry until TGT expires even if the UGI renewal thread encountered exception
> ----------------------------------------------------------------------------
>
>                 Key: HADOOP-13590
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13590
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: security
>    Affects Versions: 2.8.0, 2.7.3, 2.6.4
>            Reporter: Xiao Chen
>            Assignee: Xiao Chen
>         Attachments: HADOOP-13590.01.patch, HADOOP-13590.02.patch, HADOOP-13590.03.patch,
> HADOOP-13590.04.patch, HADOOP-13590.05.patch, HADOOP-13590.06.patch, HADOOP-13590.07.patch,
> HADOOP-13590.08.patch
>
>
> The UGI has a background thread to renew the TGT. On exception, it [terminates itself|https://github.com/apache/hadoop/blob/bee9f57f5ca9f037ade932c6fd01b0dad47a1296/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/UserGroupInformation.java#L1013-L1014].
> If something temporarily goes wrong and results in an IOException, then even if the problem recovers,
> no renewal will be done and the client will eventually fail to authenticate. We should retry on a
> best-effort basis until the TGT expires, in the hope that the error recovers before then.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org

