Date: Tue, 1 Nov 2016 16:56:59 +0000 (UTC)
From: "Xiao Chen (JIRA)"
To: common-issues@hadoop.apache.org
Subject: [jira] [Commented] (HADOOP-13590) Retry until TGT expires even if the UGI renewal thread encountered exception

    [ https://issues.apache.org/jira/browse/HADOOP-13590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15625994#comment-15625994 ]

Xiao Chen commented on HADOOP-13590:
------------------------------------

Thanks [~stevel@apache.org] for the prompt review!
Good point on {{getMaxTgtRenewalRetryCount}}. On second thought, I think it can be eliminated: the retry policy's max retries goes to {{Integer.MAX_VALUE}} and we simply check against the TGT end time. Currently the method only makes sure we can create the RetryPolicy with the correct maxRetries. I will do that in the next patch and add comments.

bq. Test-wise, I've added support for more backoff in tests that wait; look in LambdaTestUtils.
Thanks for the good work; let me try replacing the GenericTestUtils usage with it.

bq. I also see that the code to set up a javax.security.auth.login.Configuration is surfacing again...
See my [comment above|https://issues.apache.org/jira/browse/HADOOP-13590?focusedCommentId=15517201&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15517201]: it's due to the conflicting class name {{Configuration}} in Hadoop and in javax. I guess we'll have to explicitly qualify one or the other. :(
Happy to wrap up a utility function to clean up all the IBM hacks etc., but I propose creating a separate JIRA to limit the scope of this one. Please let me know if you feel otherwise.

> Retry until TGT expires even if the UGI renewal thread encountered exception
> ----------------------------------------------------------------------------
>
>                 Key: HADOOP-13590
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13590
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: security
>    Affects Versions: 2.8.0, 2.7.3, 2.6.4
>            Reporter: Xiao Chen
>            Assignee: Xiao Chen
>         Attachments: HADOOP-13590.01.patch, HADOOP-13590.02.patch, HADOOP-13590.03.patch, HADOOP-13590.04.patch, HADOOP-13590.05.patch, HADOOP-13590.06.patch, HADOOP-13590.07.patch, HADOOP-13590.08.patch
>
>
> The UGI has a background thread to renew the TGT.
> On exception, it [terminates itself|https://github.com/apache/hadoop/blob/bee9f57f5ca9f037ade932c6fd01b0dad47a1296/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/UserGroupInformation.java#L1013-L1014].
> If something temporarily goes wrong and results in an IOException, no further renewal is attempted even after the problem recovers, and the client will eventually fail to authenticate. We should retry on a best-effort basis until the TGT expires, in the hope that the error recovers before then.
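
For illustration only, the retry-until-expiry idea discussed in this issue could be sketched roughly as below. This is not code from any HADOOP-13590 patch and does not use Hadoop's RetryPolicy classes. Every name in it ({{TgtRenewalRetrySketch}}, {{Renewal}}, {{Clock}}, {{FakeClock}}, {{renewUntilExpiry}}) is hypothetical, and the exponential backoff with a cap is an assumption chosen for the example. The point it tries to show is that the loop has no retry-count limit and is bounded only by the ticket end time.

```java
import java.io.IOException;

/** Hypothetical sketch only; none of these names come from Hadoop. */
final class TgtRenewalRetrySketch {

  /** One renewal attempt, e.g. a relogin call that may fail transiently. */
  interface Renewal {
    void renew() throws IOException;
  }

  /** Clock abstraction so the loop can be exercised without real sleeping. */
  interface Clock {
    long nowMillis();
    void sleepMillis(long millis) throws InterruptedException;
  }

  /** Test clock: sleeping just advances the current time. */
  static final class FakeClock implements Clock {
    private long now;
    @Override public long nowMillis() { return now; }
    @Override public void sleepMillis(long millis) { now += millis; }
  }

  /**
   * Retries the renewal until it succeeds or the ticket end time is reached.
   * There is no cap on the number of attempts; the end time is the only bound.
   *
   * @return true if a renewal succeeded, false if the ticket expired first
   */
  static boolean renewUntilExpiry(Renewal renewal, long tgtEndTimeMillis,
      long baseBackoffMillis, long maxBackoffMillis, Clock clock)
      throws InterruptedException {
    long backoff = baseBackoffMillis;
    while (clock.nowMillis() < tgtEndTimeMillis) {
      try {
        renewal.renew();
        return true;
      } catch (IOException e) {
        // Transient failure: keep the thread alive instead of terminating.
        long remaining = tgtEndTimeMillis - clock.nowMillis();
        if (remaining <= 0) {
          break;
        }
        // Never sleep past the ticket end time.
        clock.sleepMillis(Math.min(backoff, remaining));
        // Assumed policy: double the wait, up to a fixed cap.
        backoff = Math.min(maxBackoffMillis, backoff * 2);
      }
    }
    return false;
  }

  public static void main(String[] args) throws InterruptedException {
    FakeClock clock = new FakeClock();
    int[] attempts = {0};
    // Simulate two transient failures followed by a successful renewal.
    boolean renewed = renewUntilExpiry(() -> {
      if (++attempts[0] < 3) {
        throw new IOException("simulated transient KDC failure");
      }
    }, 60_000L, 1_000L, 8_000L, clock);
    System.out.println("renewed=" + renewed + " attempts=" + attempts[0]);
  }
}
```

In a real renewal thread the clock would presumably be the system clock and the end time would come from the current Kerberos ticket; both are left abstract here so the loop can be run without a KDC.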