hadoop-yarn-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-5910) Support for multi-cluster delegation tokens
Date Fri, 20 Jan 2017 16:01:26 GMT

    [ https://issues.apache.org/jira/browse/YARN-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15832014#comment-15832014
] 

Jason Lowe commented on YARN-5910:
----------------------------------

Thanks for updating the patch!

Nit: I think the documentation should make it clearer that the regex is just an example
and not the default, e.g.: "This regex" should be "For example, the following regex".

DEFAULT_RM_DELEGATION_TOKEN_MAX_SIZE doesn't match yarn-default.xml.

It's confusing that the max size check is using capacity() but the error message uses position().
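
To illustrate why the mismatch is confusing, here is a minimal sketch, assuming the tokens are held in a java.nio.ByteBuffer. The class name, the limit, and the message text are hypothetical and are not taken from the patch. It shows that capacity() and position() can report different numbers for the same buffer, so the message should report whichever quantity the check compares.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class TokenBufferSizeDemo {
    // Hypothetical limit, chosen only for this illustration.
    static final int MAX_SIZE = 32;

    // Checks the buffer against the limit and builds the message
    // from the same quantity that the check compares.
    static String validate(ByteBuffer tokens) {
        int size = tokens.capacity();
        if (size > MAX_SIZE) {
            return "Token buffer size " + size + " exceeds the limit of " + MAX_SIZE;
        }
        return "ok";
    }

    public static void main(String[] args) {
        ByteBuffer tokens = ByteBuffer.allocate(64);
        tokens.put("abc".getBytes(StandardCharsets.UTF_8));

        // capacity() is the allocated size; position() is the current write index.
        System.out.println("capacity = " + tokens.capacity());
        System.out.println("position = " + tokens.position());
        System.out.println(validate(tokens));
    }
}
```

In this sketch the buffer is rejected because of its capacity, so a message that printed position() instead would report a size well under the limit.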

What was the reasoning for removing the assert for the NEW state?

I was unable to reproduce the TestRMRestart and TestMRIntermediateDataEncryption failures
with the patch, but TestAppManager fails consistently for me with the patch applied and passes
consistently without.  Please investigate.

> Support for multi-cluster delegation tokens
> -------------------------------------------
>
>                 Key: YARN-5910
>                 URL: https://issues.apache.org/jira/browse/YARN-5910
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: security
>            Reporter: Clay B.
>            Assignee: Jian He
>            Priority: Minor
>         Attachments: YARN-5910.01.patch, YARN-5910.2.patch, YARN-5910.3.patch, YARN-5910.4.patch,
YARN-5910.5.patch, YARN-5910.6.patch
>
>
> As an administrator running many secure (kerberized) clusters, some of which have peer clusters
managed by other teams, I am looking for a way to run jobs that may require services running
on other clusters. A particular case where this arises is running something as core
as a distcp between two kerberized clusters (e.g. {{hadoop --config /home/user292/conf/ distcp
hdfs://LOCALCLUSTER/user/user292/test.out hdfs://REMOTECLUSTER/user/user292/test.out.result}}).
> Thanks to YARN-3021, one can run for a while, but if the delegation token for the remote
cluster needs renewal the job will fail[1]. One can pre-configure the {{hdfs-site.xml}}
loaded by the YARN RM to know of all possible HDFSes available, but that requires coordination
that is not always feasible, especially as a cluster's peers grow into the tens of clusters
or span management teams. Ideally, core systems could be configured this way, while jobs
could also specify their own handling and management of tokens when needed.
> [1]: Example stack trace when the RM is unaware of a remote service:
> ----------------
> {code}
> 2016-03-23 14:59:50,528 INFO org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer: application_1458441356031_3317 found existing hdfs token Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:REMOTECLUSTER, Ident: (HDFS_DELEGATION_TOKEN token 10927 for user292)
> 2016-03-23 14:59:50,557 WARN org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer: Unable to add the application to the delegation token renewer.
> java.io.IOException: Failed to renew token: Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:REMOTECLUSTER, Ident: (HDFS_DELEGATION_TOKEN token 10927 for user292)
> at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.handleAppSubmitEvent(DelegationTokenRenewer.java:427)
> at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.access$700(DelegationTokenRenewer.java:78)
> at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.handleDTRenewerAppSubmitEvent(DelegationTokenRenewer.java:781)
> at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.run(DelegationTokenRenewer.java:762)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> Caused by: java.io.IOException: Unable to map logical nameservice URI 'hdfs://REMOTECLUSTER' to a NameNode. Local configuration does not have a failover proxy provider configured.
> at org.apache.hadoop.hdfs.DFSClient$Renewer.getNNProxy(DFSClient.java:1164)
> at org.apache.hadoop.hdfs.DFSClient$Renewer.renew(DFSClient.java:1128)
> at org.apache.hadoop.security.token.Token.renew(Token.java:377)
> at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$1.run(DelegationTokenRenewer.java:516)
> at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$1.run(DelegationTokenRenewer.java:513)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.renewToken(DelegationTokenRenewer.java:511)
> at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.handleAppSubmitEvent(DelegationTokenRenewer.java:425)
> ... 6 more
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

