Date: Tue, 19 Apr 2016 15:26:25 +0000 (UTC)
From: "Junping Du (JIRA)"
To: yarn-issues@hadoop.apache.org
Subject: [jira] [Commented] (YARN-4721) RM to try to auth with HDFS on startup, retry with max diagnostics on failure

    [ https://issues.apache.org/jira/browse/YARN-4721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15247975#comment-15247975 ]

Junping Du commented on YARN-4721:
----------------------------------

bq. I'm going to propose "hadoop.security.kerberos.diagnostics", rather than an RM specific one. That way, the code can be replicated in the NMs and the HDFS components, with the same outcome: hadoop diagnostics

+1. That sounds reasonable.
Shall we split the JIRA into two - one for Hadoop project code and one for YARN?

> RM to try to auth with HDFS on startup, retry with max diagnostics on failure
> -----------------------------------------------------------------------------
>
>                 Key: YARN-4721
>                 URL: https://issues.apache.org/jira/browse/YARN-4721
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>    Affects Versions: 2.8.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>         Attachments: HADOOP-12889-001.patch
>
>
> If the RM can't auth with HDFS, this can first surface during job submission, which can cause confusion about what's wrong and whose credentials are playing up.
> Instead, the RM could try to talk to HDFS on launch, {{ls /}} should suffice. If it can't auth, it can then tell UGI to log more and retry.
> I don't know what the policy should be if the RM can't auth to HDFS at this point. Certainly it can't currently accept work. But should it fail fast or keep going in the hope that the problem is in the KDC or NN and will fix itself without an RM restart?
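
For illustration only, the probe-then-retry flow in the issue description could be sketched roughly as below. This is a language-neutral sketch in Python, not Hadoop code and not the attached patch: `probe` stands in for the `ls /` call against HDFS, `enable_diagnostics` stands in for "tell UGI to log more", and `AuthProbeError`, `max_attempts`, `delay_seconds` and `fail_fast` are hypothetical names. The `fail_fast` flag only makes the open policy question explicit; it does not answer it.

```python
import logging
import time
from typing import Callable, Optional

log = logging.getLogger("rm.startup")


class AuthProbeError(Exception):
    """Hypothetical error raised by the probe when authentication fails."""


def probe_with_diagnostics(
    probe: Callable[[], None],
    enable_diagnostics: Callable[[], None],
    max_attempts: int = 3,
    delay_seconds: float = 0.0,
    fail_fast: bool = True,
) -> bool:
    """Run `probe`; after the first failure, enable extra diagnostics and retry.

    Returns True as soon as the probe succeeds. If every attempt fails, the
    last error is re-raised when `fail_fast` is set; otherwise False is
    returned so the caller can stay up without accepting work.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    diagnostics_on = False
    last_error: Optional[AuthProbeError] = None

    for attempt in range(1, max_attempts + 1):
        try:
            probe()  # stand-in for listing "/" on the filesystem
            return True
        except AuthProbeError as e:
            last_error = e
            log.warning("auth probe failed (attempt %d/%d): %s",
                        attempt, max_attempts, e)
            if not diagnostics_on:
                # Turn on verbose logging once, so retries carry more detail.
                enable_diagnostics()
                diagnostics_on = True
            if attempt < max_attempts:
                time.sleep(delay_seconds)

    if fail_fast:
        assert last_error is not None
        raise last_error
    return False
```

With `fail_fast=False` the caller gets `False` back and can keep retrying later, which corresponds to the "keep going in the hope that the problem fixes itself" option; with `fail_fast=True` the last authentication error propagates and startup would abort.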