hadoop-yarn-issues mailing list archives

From "Junping Du (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4721) RM to try to auth with HDFS on startup, retry with max diagnostics on failure
Date Thu, 28 Apr 2016 13:31:12 GMT

    [ https://issues.apache.org/jira/browse/YARN-4721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15262130#comment-15262130 ]

Junping Du commented on YARN-4721:

Thanks [~stevel@apache.org] for updating the patch. The 003 patch looks pretty good.
Just one question: it seems KDiagBinding uses the builder pattern for all optional parameters
in construction. The only exception is withKeytabConfKey, which returns void:
+    public void withKeytabConfKey(String key) {
+      this.keytabConfKey = key;
+    }
Shall we follow the same builder pattern here?
If so, in ResourceManager.doSecureLogin(), 
+        String keytabFilename = conf.get(YarnConfiguration.RM_KEYTAB);
+        if (keytabFilename != null) {
+          binding.withKeytab(new File(keytabFilename));
+        }
Can we simply call binding.withKeytabConfKey(YarnConfiguration.RM_KEYTAB) and handle the null
case inside withKeytabConfKey()?
Everything else looks good to me.
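
To make the suggestion concrete, here is a minimal sketch of a builder-style withKeytabConfKey() that does the null handling itself. BindingSketch is a hypothetical stand-in, not the KDiagBinding class from the patch; a plain Map stands in for the Hadoop Configuration, the string "yarn.resourcemanager.keytab" stands in for YarnConfiguration.RM_KEYTAB, and the keytab path is made up.

```java
import java.io.File;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for KDiagBinding; field and method names other than
// withKeytab/withKeytabConfKey are assumptions made for this illustration.
final class BindingSketch {
  private final Map<String, String> conf; // stand-in for a Hadoop Configuration
  private File keytab;
  private String keytabConfKey;

  BindingSketch(Map<String, String> conf) {
    this.conf = conf;
  }

  // Builder-style setter: returns this so calls can be chained.
  BindingSketch withKeytab(File keytab) {
    this.keytab = keytab;
    return this;
  }

  // Records the config key and, if the key resolves to a filename, sets the
  // keytab from it. A null key or an unset option leaves the keytab untouched.
  BindingSketch withKeytabConfKey(String key) {
    this.keytabConfKey = key;
    String filename = (key == null) ? null : conf.get(key);
    if (filename != null) {
      this.keytab = new File(filename);
    }
    return this;
  }

  File getKeytab() {
    return keytab;
  }

  String getKeytabConfKey() {
    return keytabConfKey;
  }
}

final class BindingSketchDemo {
  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    // Hypothetical path, for illustration only.
    conf.put("yarn.resourcemanager.keytab", "/etc/security/keytabs/rm.keytab");

    // The caller no longer needs its own null check.
    BindingSketch binding =
        new BindingSketch(conf).withKeytabConfKey("yarn.resourcemanager.keytab");
    System.out.println("keytab = " + binding.getKeytab());

    // With the option unset, the keytab simply stays null.
    BindingSketch unset =
        new BindingSketch(new HashMap<String, String>())
            .withKeytabConfKey("yarn.resourcemanager.keytab");
    System.out.println("keytab = " + unset.getKeytab());
  }
}
```

With this shape, the block in doSecureLogin() could shrink to a single chained call, and the null check would live in one place.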

> RM to try to auth with HDFS on startup, retry with max diagnostics on failure
> -----------------------------------------------------------------------------
>                 Key: YARN-4721
>                 URL: https://issues.apache.org/jira/browse/YARN-4721
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>    Affects Versions: 2.8.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>         Attachments: HADOOP-12289-002.patch, HADOOP-12289-003.patch, HADOOP-12889-001.patch
> If the RM can't auth with HDFS, this can first surface during job submission, which can
> cause confusion about what's wrong and whose credentials are playing up.
> Instead, the RM could try to talk to HDFS on launch, {{ls /}} should suffice. If it can't
> auth, it can then tell UGI to log more and retry.
> I don't know what the policy should be if the RM can't auth to HDFS at this point. Certainly
> it can't currently accept work. But should it fail fast or keep going in the hope that the
> problem is in the KDC or NN and will fix itself without an RM restart?
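
The probe-then-retry idea in the description can be sketched roughly as follows. This is not code from any of the attached patches: StartupAuthProbeSketch, probeWithRetry, and its parameters are hypothetical names. The probe is passed in as a Callable (in the RM it might list "/" on HDFS), and enableDiagnostics is a placeholder hook for whatever "tell UGI to log more" turns out to mean. The fail-fast versus keep-going question is left to the caller, which only gets a boolean back.

```java
import java.util.concurrent.Callable;

// Hypothetical sketch of "probe on startup, raise diagnostics on first
// failure, then retry". All names here are assumptions for illustration.
final class StartupAuthProbeSketch {

  // Runs the probe up to maxAttempts times. After the first failure the
  // diagnostics hook is invoked once, so later attempts log in more detail.
  // Returns true if any attempt succeeded, false otherwise.
  static boolean probeWithRetry(Callable<?> probe,
                                Runnable enableDiagnostics,
                                int maxAttempts,
                                long sleepMillis) {
    boolean diagnosticsEnabled = false;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        probe.call();
        return true;
      } catch (Exception e) {
        System.err.println("Auth probe attempt " + attempt + " failed: " + e);
        if (!diagnosticsEnabled) {
          enableDiagnostics.run();
          diagnosticsEnabled = true;
        }
        if (attempt < maxAttempts) {
          try {
            Thread.sleep(sleepMillis);
          } catch (InterruptedException ie) {
            // Preserve the interrupt and give up instead of retrying.
            Thread.currentThread().interrupt();
            return false;
          }
        }
      }
    }
    return false;
  }
}

final class StartupAuthProbeDemo {
  public static void main(String[] args) {
    final int[] calls = {0};
    // Simulated probe: fails once, then succeeds.
    boolean ok = StartupAuthProbeSketch.probeWithRetry(
        () -> {
          if (calls[0]++ < 1) {
            throw new java.io.IOException("simulated auth failure");
          }
          return null;
        },
        () -> System.err.println("(diagnostics hook invoked)"),
        3,
        10L);
    System.out.println("probe succeeded: " + ok);
  }
}
```

A caller that wants fail-fast behaviour could abort startup when the method returns false; one that prefers to keep going could log the result and schedule another probe later.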

This message was sent by Atlassian JIRA
