accumulo-notifications mailing list archives

From "Josh Elser (JIRA)" <>
Subject [jira] [Commented] (ACCUMULO-3513) Ensure MapReduce functionality with Kerberos enabled
Date Wed, 04 Feb 2015 22:10:35 GMT


Josh Elser commented on ACCUMULO-3513:

bq. What user does the task run as? If the effective UID is the same as its parent, the filesystem
won't protect it.

Pretty sure I covered this already: the YARN tasks run as the user who submitted the job.
This requires that your user exists across your YARN node managers. Thus, the task does not
share its parent's effective UID; it runs under an entirely different one.

bq. If only the ResourceManager and the client could authenticate with Accumulo first

Why does the resource manager need to authenticate with Accumulo? The user needs to trust
that the YARN cluster they're talking to is "real" (and not some third party that is somehow
masquerading as a YARN cluster). If a user is just submitting their credentials to anyone
who listens, the problem is with that user and not something we can solve with Accumulo.

bq. MapReduce needs to avoid granting access to its credentials from an untrusted client (which
Accumulo does trust)

I'm not sure I understand what you mean here: no user code is being run with YARN's credentials.
YARN tasks could be run by users who don't have Accumulo "accounts", but merely being able to
run a YARN job doesn't mean they can authenticate with Accumulo. That requires a delegation
token that was obtained with real credentials.

> Ensure MapReduce functionality with Kerberos enabled
> ----------------------------------------------------
>                 Key: ACCUMULO-3513
>                 URL:
>             Project: Accumulo
>          Issue Type: Bug
>          Components: client
>            Reporter: Josh Elser
>            Assignee: Josh Elser
>            Priority: Blocker
>             Fix For: 1.7.0
>         Attachments: ACCUMULO-3513-design.pdf
> I talked to [~devaraj] today about MapReduce support running on secure Hadoop to get
a picture of what extra work might be needed to make this work.
> Generally, in Hadoop and HBase, the client must have valid credentials to submit a job;
delegation tokens are then used for further communication, since the servers do not have
access to the client's sensitive information. A centralized service manages creation of
delegation tokens. A delegation token is a record containing the information (such as the
submitting user name) necessary to securely identify its holder.
> The general idea is that we would need to build support into the master to manage delegation
tokens for node managers to acquire and use to run jobs. Hadoop and HBase both contain code
which implements this general idea, but we will need to apply it to Accumulo and verify that
M/R jobs still work in a Kerberized environment.
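
The delegation-token idea described above can be sketched in a few lines. This is only an illustration under stated assumptions, not code from Accumulo, Hadoop, or HBase: the class names (`DelegationTokenSketch`, `TokenService`, `Token`), the `user:expiration` identifier layout, and the choice of HmacSHA256 are all hypothetical. A central service holds a secret key, issues a token whose password is an HMAC over the identifier, and later recognizes the holder by recomputing that HMAC, so the task never needs the user's Kerberos credentials.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

public class DelegationTokenSketch {

    /** What a client would hand to its tasks: an identifier plus a password derived from it. */
    static final class Token {
        final String identifier; // hypothetical layout: "user:expirationMillis"
        final byte[] password;   // HMAC of the identifier under the service's secret key

        Token(String identifier, byte[] password) {
            this.identifier = identifier;
            this.password = password;
        }
    }

    /** Stand-in for the centralized service that creates and checks tokens. */
    static final class TokenService {
        private final SecretKeySpec key;

        TokenService(byte[] secret) {
            this.key = new SecretKeySpec(secret, "HmacSHA256");
        }

        /** Issue a token; a real service would do this only for an already authenticated user. */
        Token issue(String user, long expirationMillis) {
            String id = user + ":" + expirationMillis;
            return new Token(id, sign(id));
        }

        /** Returns the user name if the token is genuine and unexpired, otherwise null. */
        String verify(Token token, long nowMillis) {
            // Constant-time comparison of the presented password with the recomputed HMAC.
            if (!MessageDigest.isEqual(sign(token.identifier), token.password)) {
                return null;
            }
            int sep = token.identifier.lastIndexOf(':');
            long expiration = Long.parseLong(token.identifier.substring(sep + 1));
            return nowMillis < expiration ? token.identifier.substring(0, sep) : null;
        }

        private byte[] sign(String identifier) {
            try {
                Mac mac = Mac.getInstance("HmacSHA256");
                mac.init(key);
                return mac.doFinal(identifier.getBytes(StandardCharsets.UTF_8));
            } catch (GeneralSecurityException e) {
                throw new IllegalStateException(e);
            }
        }
    }

    public static void main(String[] args) {
        TokenService service = new TokenService("demo-secret".getBytes(StandardCharsets.UTF_8));
        long now = System.currentTimeMillis();
        Token token = service.issue("alice", now + 60_000);

        System.out.println("genuine token -> " + service.verify(token, now));
        Token forged = new Token("mallory:" + (now + 60_000), token.password);
        System.out.println("forged token  -> " + service.verify(forged, now));
    }
}
```

Under these assumptions, a task holding the token can prove who it acts for without access to the client's sensitive information, and a task that rewrites the user name in the identifier is rejected because it cannot recompute the password without the service's key.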
