hadoop-common-dev mailing list archives

From "Kan Zhang (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-4343) Adding user and service-to-service authentication to Hadoop
Date Wed, 04 Mar 2009 20:23:56 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12678899#action_12678899 ]

Kan Zhang commented on HADOOP-4343:

More details on the delegation token design.

h4. Overview

After initial authentication to NN using Kerberos credentials, a user may obtain a delegation
token, which can be given to user jobs for subsequent authentication to NN as the user. The
token is in fact a secret key shared between the user and NN and should be protected when
passed over insecure channels. Anyone who gets it can impersonate the user on NN. Note that
*a user can only obtain new tokens after authenticating using Kerberos*.

When a user obtains a delegation token from NN, the user should tell NN who the designated
token renewer is. The designated renewer should authenticate to NN as itself when renewing the
token for the user. Renewing a token means extending the validity period of that token on
NN. No new token is issued. The old token continues to work. To let a Map/Reduce job use a
delegation token, the user needs to designate JT as the token renewer. All the Tasks of the
same job use the same token. JT is responsible for keeping the token valid till the job is
finished. After that, JT may optionally cancel the token. 

h4. Design

Here is the format of a delegation token.

TokenID = {ownerID, renewerID, issueDate, maxDate}
TokenAuthenticator = HMAC(masterKey, TokenID)
Delegation Token = {TokenID, TokenAuthenticator}

NN chooses {{masterKey}} randomly and uses it to generate and verify delegation tokens. NN
keeps all active tokens in memory and associates each token with an {{expiryDate}}. If
{{currentTime > expiryDate}}, the token is considered expired and any client authentication request using
the token will be rejected. Expired tokens will be deleted from memory. A token is also deleted
from memory when the owner or the renewer cancels the token.
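
The token format above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the proposed implementation: the comment does not specify the hash function or how {{TokenID}} is serialized, so HMAC-SHA256, the NUL-separated encoding, and the names {{TokenID.to_bytes}}, {{token_authenticator}} and {{issue_token}} are all hypothetical.

```python
import hashlib
import hmac
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenID:
    owner_id: str
    renewer_id: str
    issue_date: int  # seconds since epoch (assumed unit)
    max_date: int    # the token can never be renewed past this time

    def to_bytes(self) -> bytes:
        # Hypothetical serialization: fields joined with NUL bytes.
        fields = [self.owner_id, self.renewer_id,
                  str(self.issue_date), str(self.max_date)]
        return "\x00".join(fields).encode("utf-8")


def token_authenticator(master_key: bytes, token_id: TokenID) -> bytes:
    # TokenAuthenticator = HMAC(masterKey, TokenID); SHA-256 is an assumption.
    return hmac.new(master_key, token_id.to_bytes(), hashlib.sha256).digest()


def issue_token(master_key: bytes, owner_id: str, renewer_id: str,
                now: int, max_lifetime: int):
    # Delegation Token = {TokenID, TokenAuthenticator}
    token_id = TokenID(owner_id, renewer_id, now, now + max_lifetime)
    return token_id, token_authenticator(master_key, token_id)


master_key = os.urandom(32)  # NN chooses masterKey randomly
token_id, authenticator = issue_token(
    master_key, "alice", "jobtracker", now=1000, max_lifetime=7 * 24 * 3600)
```

Because the authenticator is a pure function of {{masterKey}} and {{TokenID}}, NN can recompute it at any time without having stored it.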

*Using Delegation Token* When a client (e.g., a Task) uses a delegation token to authenticate,
it first sends {{TokenID}} to NN (but never sends the associated {{TokenAuthenticator}} to
NN). {{TokenID}} identifies the token the client intends to use. Using {{TokenID}} and {{masterKey}},
NN can re-compute {{TokenAuthenticator}} and thus the token. NN checks if the token is valid. A
token is valid if and only if it exists in memory and {{currentTime < expiryDate}} holds for the
{{expiryDate}} associated with it. If the token is valid, the client and NN will try to authenticate
each other using their own {{TokenAuthenticator}} as the secret key and [DIGEST-MD5|http://www.ietf.org/rfc/rfc2831.txt]
as the protocol. Note that during authentication, one party never reveals its own {{TokenAuthenticator}}
to the other party. If authentication fails (which means the client and NN do not share the
same {{TokenAuthenticator}}), they don't get to know each other's {{TokenAuthenticator}}.
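
The flow above might look roughly like the sketch below. Note the substitution: this is *not* DIGEST-MD5. A simple HMAC challenge-response stands in for it, only to show that each side proves knowledge of {{TokenAuthenticator}} without sending it. The function names, the byte-string {{TokenID}} encoding, the role labels and the use of SHA-256 are all assumptions made for illustration.

```python
import hashlib
import hmac
import os


def recompute_authenticator(master_key: bytes, token_id: bytes) -> bytes:
    # NN derives TokenAuthenticator from the TokenID the client sent.
    return hmac.new(master_key, token_id, hashlib.sha256).digest()


def is_valid(active_tokens: dict, token_id: bytes, now: int) -> bool:
    # Valid iff the token is in memory and currentTime < expiryDate.
    expiry_date = active_tokens.get(token_id)
    return expiry_date is not None and now < expiry_date


def prove(secret: bytes, role: bytes, challenge: bytes) -> bytes:
    # Response to a challenge; the role label keeps the two directions distinct.
    return hmac.new(secret, role + challenge, hashlib.sha256).digest()


def mutual_auth(client_secret: bytes, nn_secret: bytes) -> tuple:
    # Each side challenges the other; only challenges and responses are exchanged.
    nn_challenge = os.urandom(16)
    client_challenge = os.urandom(16)
    client_response = prove(client_secret, b"client", nn_challenge)
    nn_response = prove(nn_secret, b"nn", client_challenge)
    client_ok = hmac.compare_digest(
        client_response, prove(nn_secret, b"client", nn_challenge))
    nn_ok = hmac.compare_digest(
        nn_response, prove(client_secret, b"nn", client_challenge))
    return client_ok, nn_ok


master_key = os.urandom(32)
token_id = b"alice|jobtracker|1000|605800"       # hypothetical encoding
active_tokens = {token_id: 2000}                 # TokenID -> expiryDate
client_secret = recompute_authenticator(master_key, token_id)  # given at issue time

if is_valid(active_tokens, token_id, now=1500):
    nn_secret = recompute_authenticator(master_key, token_id)
    print(mutual_auth(client_secret, nn_secret))
```

If the two sides hold different secrets, both checks fail and neither side learns the other's {{TokenAuthenticator}}, since only HMAC outputs over random challenges cross the wire.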

*Token Renewal* Delegation tokens need to be renewed periodically to keep them valid. Suppose
JT is the designated renewer for a token. During renewal, JT authenticates to NN as JT. After
successful authentication, JT sends the token to be renewed to NN. NN verifies that 1) JT
is the renewer specified in {{TokenID}}, 2) {{TokenAuthenticator}} is correct, and 3)
{{currentTime < maxDate}} specified in {{TokenID}}. Upon successful verification, if the token exists
in memory, which means the token is currently valid, NN sets its new {{expiryDate}} to
{{min(currentTime+renewPeriod, maxDate)}}. If the token doesn't exist in memory, which indicates NN has restarted and therefore
lost memory of all previously stored tokens, NN adds the token to memory and sets its {{expiryDate}}
similarly. The latter case allows jobs to survive NN restarts. All JT has to do is to renew
all tokens with NN after NN restarts and before relaunching failed Tasks.
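
The renewal checks described above could be sketched as follows. This is a minimal illustration, not the actual NN code: the class {{NameNodeTokens}}, the tuple form of {{TokenID}}, the 24-hour {{renew_period}} and the use of HMAC-SHA256 are assumptions, and the caller is presumed to have already authenticated as itself.

```python
import hashlib
import hmac


class NameNodeTokens:
    """Hypothetical in-memory token table kept by NN."""

    def __init__(self, master_key: bytes, renew_period: int = 24 * 3600):
        self.master_key = master_key      # persisted across restarts
        self.renew_period = renew_period  # assumed value, in seconds
        self.expiry = {}                  # TokenID -> expiryDate, memory only

    def authenticator(self, token_id: tuple) -> bytes:
        encoded = "\x00".join(str(f) for f in token_id).encode("utf-8")
        return hmac.new(self.master_key, encoded, hashlib.sha256).digest()

    def renew(self, caller: str, token_id: tuple,
              authenticator: bytes, now: int) -> int:
        owner_id, renewer_id, issue_date, max_date = token_id
        # 1) the authenticated caller must be the designated renewer
        if caller != renewer_id:
            raise PermissionError("caller is not the designated renewer")
        # 2) TokenAuthenticator must match what NN recomputes
        if not hmac.compare_digest(self.authenticator(token_id), authenticator):
            raise PermissionError("incorrect TokenAuthenticator")
        # 3) currentTime < maxDate
        if not now < max_date:
            raise PermissionError("token is past maxDate")
        # Same update whether or not the token is currently in memory.
        self.expiry[token_id] = min(now + self.renew_period, max_date)
        return self.expiry[token_id]


nn = NameNodeTokens(b"k" * 32)
token_id = ("alice", "jobtracker", 0, 7 * 24 * 3600)
auth = nn.authenticator(token_id)

first = nn.renew("jobtracker", token_id, auth, now=100)
nn.expiry.clear()  # simulate an NN restart: tokens lost, masterKey kept
second = nn.renew("jobtracker", token_id, auth, now=200)
```

After the simulated restart the token is simply added back with a fresh {{expiryDate}}, which is what lets a running job survive.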

Note that the designated renewer can revive an expired (or canceled) token by simply renewing
it, if {{currentTime < maxDate}} specified in the token. This is because NN can't tell
the difference between a token that has expired (or has been canceled) and a token that is
not in the memory because NN restarted. Since only the designated renewer can revive an expired
(or canceled) token, this doesn't seem to be a security problem. An attacker who steals the
token can't renew or revive it.

The {{masterKey}} needs to be updated periodically. NN only needs to persist the {{masterKey}}
on disk, not the tokens.

> Adding user and service-to-service authentication to Hadoop
> -----------------------------------------------------------
>                 Key: HADOOP-4343
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4343
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: Kan Zhang
>            Assignee: Kan Zhang
> Currently, Hadoop services do not authenticate users or other services. As a result,
> Hadoop is subject to the following security risks.
> 1. A user can access an HDFS or M/R cluster as any other user. This makes it impossible
> to enforce access control in an uncooperative environment. For example, file permission checking
> on HDFS can be easily circumvented.
> 2. An attacker can masquerade as Hadoop services. For example, user code running on a
> M/R cluster can register itself as a new TaskTracker.
> This JIRA is intended to be a tracking JIRA, where we discuss requirements, agree on
> a general approach and identify subtasks. Detailed design and implementation are the subject
> of those subtasks.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
