accumulo-notifications mailing list archives

From "Josh Elser (JIRA)" <>
Subject [jira] [Commented] (ACCUMULO-3513) Ensure MapReduce functionality with Kerberos enabled
Date Wed, 28 Jan 2015 03:53:34 GMT


Josh Elser commented on ACCUMULO-3513:

bq. Well, no... we don't know that it does this already. We have no idea how it may have been
compromised internally

I'm not sure how we can make any reliable security model if we operate under the assumption
that YARN is insecure. We have to trust that the YARN task was correctly authenticated.

bq. Accumulo and the real client is trustworthy and is handling the client's credentials properly

Again, we have to assume YARN is doing the right thing.

bq.  it's not much of a stretch to just trust that it is acting on behalf of user X, simply
because it says so

That's the point I'm trying to make. That trust is a *huge* stretch. The code running inside
a YARN task is untrusted (unless you restrict job submission and vet the users externally
-- hit the users with a stick and tell them to behave). We should not trust this code
to act only as the user it is supposed to represent.

bq. The extra, expirable, shared secret is nice, but it doesn't get us much further than what
we can do without it, in my opinion

The shared secret is acting in place of the Kerberos credentials because there are no credentials
available for use. It's not optional -- it's what acts as the authentication (password over
SASL instead of the Kerberos identity). This is the best snippet I've read that describes it:

Kerberos is a 3-party protocol that solves the hard problem of setting up an authenticated
connection between a client and a server that have never communicated with each other before
(but they both registered with Kerberos KDC). Our delegation token is also used to set up an
authenticated connection between a client and a server (NameNode in this case). The difference
is that we assume the client and the server had previously shared a secure connection (via
Kerberos), over which a delegation token can be exchanged. Hence, delegation token is
essentially a 2-party protocol and much simpler than Kerberos. However, we use Kerberos to
bootstrap the initial trust between a client and NameNode in order to exchange the delegation
token for later use to set up another secure connection between the client (actually job tasks
launched on behalf of the client) and the same NameNode.

Please take some time to read [this overview on Hadoop security|].
It covers these points.
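
To make the 2-party idea in the quoted passage concrete, here is a minimal sketch in Python. It is not Accumulo's or Hadoop's actual implementation: the `TokenAuthority` class, its method names, the JSON identifier layout, and the choice of HMAC-SHA256 are all assumptions made for illustration. The server keeps a master key, derives the token password from the identifier (which names the user and an expiration), and later re-derives it to authenticate a task that holds no Kerberos credentials. A real deployment would prove possession of the password through a SASL challenge-response rather than sending it, which this sketch skips.

```python
import hashlib
import hmac
import json
import os
import time
from typing import Optional, Tuple


class TokenAuthority:
    """Hypothetical stand-in for the server-side service that issues tokens."""

    def __init__(self) -> None:
        # Master key known only to the servers, never to clients or tasks.
        self._master_key = os.urandom(32)

    def _password_for(self, identifier: bytes) -> bytes:
        # The token password is an HMAC of the identifier under the master key.
        return hmac.new(self._master_key, identifier, hashlib.sha256).digest()

    def issue(self, user: str, lifetime_s: float,
              now: Optional[float] = None) -> Tuple[bytes, bytes]:
        """Issue a token. Assumes the caller was already authenticated as
        `user` over a Kerberos-secured connection (the bootstrap step)."""
        now = time.time() if now is None else now
        identifier = json.dumps(
            {"user": user, "expires": now + lifetime_s}, sort_keys=True
        ).encode("utf-8")
        return identifier, self._password_for(identifier)

    def authenticate(self, identifier: bytes, password: bytes,
                     now: Optional[float] = None) -> Optional[str]:
        """Return the user name if the token is genuine and unexpired,
        otherwise None. No Kerberos credentials are needed here."""
        now = time.time() if now is None else now
        expected = self._password_for(identifier)
        if not hmac.compare_digest(expected, password):
            return None  # identifier was forged or altered
        fields = json.loads(identifier)
        if now >= fields["expires"]:
            return None  # token has expired
        return fields["user"]


# Usage: the real client obtains the token, then ships it with the job.
authority = TokenAuthority()
identifier, password = authority.issue("alice", lifetime_s=3600)
task_user = authority.authenticate(identifier, password)
```

The point of the sketch is that the task never asserts "I am user X" on its own word: the user name is bound into the identifier, and only a holder of the matching password can use it. Changing the user name in the identifier invalidates the password.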

> Ensure MapReduce functionality with Kerberos enabled
> ----------------------------------------------------
>                 Key: ACCUMULO-3513
>                 URL:
>             Project: Accumulo
>          Issue Type: Bug
>          Components: client
>            Reporter: Josh Elser
>            Assignee: Josh Elser
>            Priority: Blocker
>             Fix For: 1.7.0
> I talked to [~devaraj] today about MapReduce support running on secure Hadoop to help
get a picture about what extra might be needed to make this work.
> Generally, in Hadoop and HBase, the client must have valid credentials to submit a job,
then the notion of delegation tokens is used for further communication since the servers
do not have access to the client's sensitive information. A centralized service manages creation
of a delegation token, a record containing certain information (such as the submitting
user name) necessary to securely identify the holder of the delegation token.
> The general idea is that we would need to build support into the master to manage delegation
tokens, which node managers can acquire and use to run jobs. Hadoop and HBase both contain code
which implements this general idea, but we will need to apply it to Accumulo and verify that
M/R jobs still work in a kerberized environment.

This message was sent by Atlassian JIRA
