hadoop-yarn-issues mailing list archives

From "Daniel Templeton (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4632) Replacing _HOST in RM_PRINCIPAL should not be the responsibility of the client code
Date Wed, 27 Jan 2016 19:31:40 GMT

    [ https://issues.apache.org/jira/browse/YARN-4632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15120025#comment-15120025 ]

Daniel Templeton commented on YARN-4632:

Well, *that* was an entertaining rabbit hole.  [~daryn], it would appear that you are correct.
 To save others from tilting at this particular windmill in the future, I will document my
journey here.

When the client wants to run a YARN app against a secure HDFS cluster, it has to pass that
application a delegation token (DT) that tells HDFS that the YARN app is allowed to access HDFS
as the client.  That token gets serialized to bytes and passed over to the RM as part of the
launch context.  When the RM launches the app, the first thing it does is try to renew all
the tokens for the app.  The DT contains the principal of the service that is allowed to renew
it.  If that service principal doesn't match the name of the RM, the token renewal will fail,
causing the app launch to fail.
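
To make the failure mode concrete, here is a toy model of that renewer check.  It is only a
sketch: the class and method names are hypothetical, the principals are made-up examples, and
the real check in Hadoop is more involved than a plain string comparison.

```java
// Toy model of the renewer check; all names here are hypothetical, not Hadoop APIs.
final class RenewerCheckSketch {

    /** Immutable token: the renewer is fixed at creation time and has no setter. */
    static final class Token {
        final String owner;
        final String renewer;

        Token(String owner, String renewer) {
            this.owner = owner;
            this.renewer = renewer;
        }
    }

    /** Renewal is allowed only when the caller is the renewer named in the token. */
    static boolean canRenew(Token token, String callerPrincipal) {
        return token.renewer.equals(callerPrincipal);
    }

    public static void main(String[] args) {
        String rm = "rm/rm1.example.com@EXAMPLE.COM";

        // The client left the placeholder in place, so the names do not match.
        Token bad = new Token("alice@EXAMPLE.COM", "rm/_HOST@EXAMPLE.COM");
        System.out.println(canRenew(bad, rm));

        // The client named the RM's real principal, so renewal is allowed.
        Token good = new Token("alice@EXAMPLE.COM", rm);
        System.out.println(canRenew(good, rm));
    }
}
```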

The token's renewer string is something the client provides to HDFS when it requests the DT.
 To find the RM's service name, the client typically looks up the {{YarnConfiguration.RM_PRINCIPAL}}
in the conf.  That configuration will typically list the service's hostname as {{_HOST}} to
allow the hostname to be replaced in the event of HA YARN.  That {{_HOST}} placeholder has
to be replaced by the RM's hostname by the client.  The point of this JIRA was to try to move
that hostname replacement out of the client's hands.
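
For illustration, the substitution the client has to perform looks roughly like the sketch
below.  In real client code the call to use is {{SecurityUtil.getServerPrincipal()}}, as the
issue description says; this dependency-free version only imitates the idea.  The class name,
method name, and example principal and hostname are all assumptions made for the example.

```java
import java.util.Locale;

// Hypothetical stand-in for the _HOST substitution; not Hadoop's own class.
final class HostPlaceholderSketch {
    static final String HOST_PLACEHOLDER = "_HOST";

    /**
     * Returns the principal with _HOST replaced by the lower-cased hostname.
     * Principals that are not of the form name/_HOST@REALM are returned unchanged.
     */
    static String resolvePrincipal(String principal, String rmHostname) {
        String[] parts = principal.split("[/@]");
        if (parts.length != 3 || !HOST_PLACEHOLDER.equals(parts[1])) {
            return principal;
        }
        return parts[0] + "/" + rmHostname.toLowerCase(Locale.ROOT) + "@" + parts[2];
    }

    public static void main(String[] args) {
        // Example of a value a client might read for RM_PRINCIPAL from the conf.
        String configured = "rm/_HOST@EXAMPLE.COM";
        // The resolved name is what the client would hand to HDFS as the renewer.
        System.out.println(resolvePrincipal(configured, "RM1.example.com"));
    }
}
```

The resolved string, not the raw configuration value, is what has to be given to HDFS as the
renewer when the DT is requested.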

The problem is that the renewer string that the client sets is stashed away deep inside the
token, so once it's set, it's hard to unset.  It is theoretically possible to create a new
token based on the original token, except fixing the {{_HOST}} string in the renewer, but
the bad news is that creating a new HDFS token can only be done by a service that has access
to HDFS's delegation secret manager service, which is not exposed outside HDFS.  The net result
is that the only code that can change the renewer principal is HDFS itself.  Since the token
creation also requires the client's principal, the client is the only one that can do it.

I now realize that it doesn't actually make any sense to have the renewer set anywhere other
than the client code.  The whole point of the operation is for the client to explicitly grant
access to the renewer to keep a token alive.  Anything other than the client being the one
to name the renewer breaks the security chain.  The only viable approach to solving the problem
this JIRA set out to solve is to make the {{_HOST}} replacement as simple as possible.  See
YARN-4629.  It's not a complete solution, but it's all we've got.

> Replacing _HOST in RM_PRINCIPAL should not be the responsibility of the client code
> -----------------------------------------------------------------------------------
>                 Key: YARN-4632
>                 URL: https://issues.apache.org/jira/browse/YARN-4632
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: api, resourcemanager
>    Affects Versions: 2.7.1
>            Reporter: Daniel Templeton
>            Assignee: Daniel Templeton
>            Priority: Critical
> It is currently the client's responsibility to call {{SecurityUtil.getServerPrincipal()}}
> to replace the _HOST placeholder in any principal name used for a delegation token.  This
> is a non-optional operation and should not be pushed onto the client.
> All client apps that followed the distributed shell as the canonical example failed to
> do the replacement because distributed shell fails to do the replacement.  (See YARN-4629.)
> Rather than fixing the whole world, since the whole world uses distributed shell as a model,
> let's move the operation into YARN where it belongs.

This message was sent by Atlassian JIRA
