accumulo-notifications mailing list archives

From "Christopher Tubbs (JIRA)" <>
Subject [jira] [Commented] (ACCUMULO-3513) Ensure MapReduce functionality with Kerberos enabled
Date Wed, 04 Feb 2015 22:04:35 GMT


Christopher Tubbs commented on ACCUMULO-3513:

{quote}Keytabs on disk should be protected by the filesystem. ...  A little C program ...
drops permissions ...{quote}

What user does the task run as? If the effective UID is the same as its parent, the filesystem
won't protect it.
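
To make the concern concrete, here is a minimal sketch of the owner-permission check that applies to a keytab on disk. The function name `owner_can_read` and the UIDs are hypothetical, chosen only for illustration, and the check is deliberately simplified: it ignores root, the group/other bits, and ACLs.

```python
import stat


def owner_can_read(file_uid: int, file_mode: int, effective_uid: int) -> bool:
    """Simplified POSIX owner check: only the owner-read bit is considered.

    Assumption: group/other bits, ACLs, and root (UID 0) are ignored.
    """
    return file_uid == effective_uid and bool(file_mode & stat.S_IRUSR)


# A keytab locked down to mode 0600, owned by a hypothetical UID 1001.
keytab_uid, keytab_mode = 1001, 0o600

# Task with the same effective UID as its parent: the keytab is readable.
print(owner_can_read(keytab_uid, keytab_mode, effective_uid=1001))

# Task running under a different UID: the mode bits do keep it out.
print(owner_can_read(keytab_uid, keytab_mode, effective_uid=2002))
```

In other words, restrictive mode bits only help if the task really runs under a different UID than the process that owns the keytab.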

{quote}... it's expected that the delegation token is protected from prying eyes ...{quote}

There seems to be a trade-off here between competing goals. On the one hand, we need to make
sure Accumulo doesn't give up data to an untrusted middle-man. On the other hand, MapReduce
needs to avoid granting access to its credentials to an untrusted client (which Accumulo
*does* trust).

If only the ResourceManager *and* the client could both authenticate with Accumulo first, we
could carry information about both of them in the token that the actual task uses to
authenticate to Accumulo. We could then trust both the middle-man (the YARN task) *and* the
client to receive the data from Accumulo.
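
As a rough sketch of that idea (not Accumulo's or Hadoop's actual token format or API), a token could name both principals and be signed with a secret only Accumulo holds. Everything here is an assumption made for illustration: the names `issue_token` and `verify_token`, the `SECRET` value, the JSON payload, and the use of HMAC-SHA256.

```python
import hashlib
import hmac
import json

# Hypothetical secret known only to Accumulo; illustration only.
SECRET = b"accumulo-side-secret"


def issue_token(client: str, resource_manager: str) -> dict:
    """Issue a token naming both principals that authenticated up front."""
    payload = json.dumps({"client": client, "rm": resource_manager}, sort_keys=True)
    sig = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    return {"payload": payload, "sig": sig}


def verify_token(token: dict, trusted_rms: set):
    """Return the client principal if the token is intact and names a trusted RM."""
    expected = hmac.new(SECRET, token["payload"].encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, token["sig"]):
        return None  # payload or signature was tampered with
    claims = json.loads(token["payload"])
    if claims["rm"] not in trusted_rms:
        return None  # the middle-man is not one Accumulo trusts
    return claims["client"]


# The task presents the token; Accumulo checks both identities before serving data.
token = issue_token("alice", "rm1")
print(verify_token(token, {"rm1"}))
```

The point of the sketch is only that a single token can bind the client and the middle-man together, so the server can refuse the request if either one is not who it expects.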

> Ensure MapReduce functionality with Kerberos enabled
> ----------------------------------------------------
>                 Key: ACCUMULO-3513
>                 URL:
>             Project: Accumulo
>          Issue Type: Bug
>          Components: client
>            Reporter: Josh Elser
>            Assignee: Josh Elser
>            Priority: Blocker
>             Fix For: 1.7.0
>         Attachments: ACCUMULO-3513-design.pdf
> I talked to [~devaraj] today about MapReduce support running on secure Hadoop to
get a picture of what extra work might be needed to make this work.
> Generally, in Hadoop and HBase, the client must have valid credentials to submit a job.
Delegation tokens are then used for further communication, since the servers
do not have access to the client's sensitive information. A centralized service manages creation
of a delegation token, a record containing certain information (such as the submitting
user name) necessary to securely identify the holder of the delegation token.
> The general idea is that we would need to build support into the master to manage delegation
tokens for node managers to acquire and use to run jobs. Hadoop and HBase both contain code
which implements this general idea, but we will need to apply it to Accumulo and verify that
M/R jobs still work in a kerberized environment.
