accumulo-dev mailing list archives

From "Josh Elser" <>
Subject Re: Review Request 29386: ACCUMULO-2815 Client authentication via Kerberos
Date Wed, 31 Dec 2014 17:05:49 GMT

> On Dec. 30, 2014, 7:39 p.m., kturner wrote:
> > core/src/main/java/org/apache/accumulo/core/client/security/tokens/,
line 46
> >
> >     This is adding UserGroupInformation to the Accumulo API.  Is that a stable Hadoop
API?  Does it have to be in the API? Why not just a principal?
> Josh Elser wrote:
>     UGI is marked as LimitedPrivate and Evolving, but I have no qualms against including
it into our API. The problem with accepting a principal alone is that we would then have to duplicate
the login capabilities that UGI already provides (e.g., performing a login via password or keytab).
By accepting a UGI directly, we don't have to "own" this logic. It's my opinion that it's
easiest to just use UGI directly, but could see an argument for "transparently" using it behind
the scenes if you think that would be better.
> kturner wrote:
>     > I have no qualms against including it into our API
>     Why do you think it's ok to ignore the API marking?  Shouldn't we honor what's marked
on the assumption that someone in Hadoop land will think they are free to make breaking changes
based on what's marked?
>     > but could see an argument for "transparently" using it behind the scenes if
you think that would be better.
>     From an implementation perspective, I don't know what's best.  I don't know what the
benefits of using UGI are.  What are the problems with not using it?  Would not using it require
a complex facade class?  Would not using it cause interoperability problems with others using
Kerberos in the Hadoop ecosystem?
>     If this object is really useful to users of Hadoop and we want to use it in our API,
I would advocate for opening a Hadoop issue to change the stability guarantees.  I am uncertain
if we should use it before this change is made, but that's because I don't fully understand
the details.

bq. Why do you think it's ok to ignore the API marking? Shouldn't we honor what's marked on the
assumption that someone in Hadoop land will think they are free to make breaking changes based
on what's marked?

The only difference between Stable and Evolving is that breaking changes could happen at minor
releases instead of just major releases (akin to semver). So, just to make a point, even using
only Stable classes will result in us having to worry about breaking changes.

It's possible that we could use a very thin shim around the most often used methods of UGI,
but I'm worried that would be insufficient. For example, most people use `UserGroupInformation.loginUserFromKeytab`
or they have a cached ticket (from `kinit`'ing on the command line before invoking the
application) which we can get access to via `UserGroupInformation.getCurrentUser`. The other
approach is to use `UserGroupInformation.createProxyUser` which accepts a principal and "proxies"
another user on top of another UGI -- this is very useful for us as it eliminates the need
to do Authorization intersection as some "server-user" to maintain application security. The
server is running an Accumulo query for the user; however Accumulo "sees" the user, not the

Back to the point though: this assertion is a fail-fast check to make sure that the user
trying to use Kerberos is actually logged in (catch it and fail before we actually try to
send an RPC). We could push this further down into the codebase -- inside ConnectorImpl maybe?
-- which would make this even more of a "shell" AuthenticationToken. To clarify, your biggest
worry is the UGI argument in the constructor, not our use of UGI in the implementation, right?
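
To make the fail-fast idea concrete, here is a minimal sketch of a token that validates the
login in its constructor, before any RPC is attempted. It deliberately avoids a Hadoop
dependency: `LoginState` is a hypothetical stand-in for the small part of UGI such a check
would need, and `Token` is a hypothetical simplification, not the KerberosToken from the patch.

```java
import java.util.Objects;

public class KerberosTokenSketch {

    /** Hypothetical stand-in for the parts of UGI that a login check would need. */
    public interface LoginState {
        String principal();
        boolean hasKerberosCredentials();
    }

    /** Hypothetical token: rejects a user who is not logged in, at construction time. */
    public static final class Token {
        private final String principal;

        public Token(LoginState login) {
            Objects.requireNonNull(login, "login state must not be null");
            if (!login.hasKerberosCredentials()) {
                // Fail here, on the client, instead of during the first RPC.
                throw new IllegalArgumentException(
                    "User " + login.principal() + " has no Kerberos credentials; log in first");
            }
            this.principal = login.principal();
        }

        public String principal() {
            return principal;
        }
    }

    /** Builds a fixed LoginState for illustration; a real one would wrap an actual login. */
    public static LoginState login(String principal, boolean kerberos) {
        return new LoginState() {
            @Override
            public String principal() {
                return principal;
            }

            @Override
            public boolean hasKerberosCredentials() {
                return kerberos;
            }
        };
    }

    /** Returns true if constructing a token for the given login is rejected. */
    public static boolean failsFast(LoginState login) {
        try {
            new Token(login);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(new Token(login("user@EXAMPLE.COM", true)).principal());
        System.out.println(failsFast(login("user@EXAMPLE.COM", false)));
    }
}
```

If the check moved further down (for example, to wherever the connection is created), the
constructor would only record the principal and the token would carry no login logic at all.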

- Josh

This is an automatically generated e-mail. To reply, visit:

On Dec. 31, 2014, 4:36 a.m., Josh Elser wrote:
> (Updated Dec. 31, 2014, 4:36 a.m.)
> Review request for accumulo.
> Bugs: ACCUMULO-2815
> Repository: accumulo
> Description
> -------
> ACCUMULO-2815 Initial support for Kerberos client authentication.
> Leverage SASL transport provided by Thrift which can speak GSSAPI, which Kerberos implements.
> * An Accumulo KerberosToken which is an AuthenticationToken to validate users.
> * Custom thrift processor and invocation handler to ensure server RPCs have a valid KRB
identity and Accumulo authentication.
> * A KerberosAuthenticator which extends ZKAuthenticator to support Kerberos identities
> * New ClientConf variables to use SASL transport and pass Kerberos server principal
> * Updated ClientOpts and Shell opts to transparently use a KerberosToken when SASL is
enabled (no extra client work).
> I believe this is the "bare minimum" for Kerberos support. These changes are also grossly lacking
in unit and integration tests. I believe that I might have somehow broken the client address
string in the server (I saw log messages with client: null, but I'm not sure if it's due to
these changes or not). A necessary limitation in the Thrift server used is that, like the
SSL transport, the SASL transport cannot presently be used with the TFramedTransport, which
means none of the [half]async thrift servers will function with this -- we're stuck with the
> Performed some contrived benchmarks on my laptop (while still using it myself) to get
a big-picture view of the performance impact against "normal" operation and Kerberos alone.
Each "run" was the duration to ingest 100M records using continuous-ingest, timed with `time`,
using 'real'.
> THsHaServer (our default), 6 runs:
> Avg: 10m7.273s (607.273s)
> Min: 9m43.395s
> Max: 10m52.715s
> TThreadPoolServer (no SASL), 5 runs:
> Avg: 11m16.254s (676.254s)
> Min: 10m30.987s
> Max: 12m24.192s
> TThreadPoolServer+SASL/GSSAPI (these changes), 6 runs:
> Avg: 13m17.187s (797.187s)
> Min: 10m52.997s
> Max: 16m0.975s
> The general takeaway is that there's about 15% performance degradation in its initial
state which is in the realm of what I expected (~10%).
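
As a rough check on that figure, the relative slowdowns can be recomputed from the three
averages reported above. The class and method names below are hypothetical; only the timings
come from the benchmark numbers in this description.

```java
public class SlowdownCheck {

    /** Relative slowdown of a run versus a baseline run, as a percentage. */
    public static double slowdownPercent(double baselineSeconds, double slowerSeconds) {
        return (slowerSeconds / baselineSeconds - 1.0) * 100.0;
    }

    public static void main(String[] args) {
        double hsHa = 607.273;           // THsHaServer average
        double threadPool = 676.254;     // TThreadPoolServer average, no SASL
        double threadPoolSasl = 797.187; // TThreadPoolServer + SASL/GSSAPI average

        System.out.printf("SASL vs. plain TThreadPoolServer: %.1f%%%n",
            slowdownPercent(threadPool, threadPoolSasl));
        System.out.printf("SASL vs. THsHaServer: %.1f%%%n",
            slowdownPercent(hsHa, threadPoolSasl));
        System.out.printf("Plain TThreadPoolServer vs. THsHaServer: %.1f%%%n",
            slowdownPercent(hsHa, threadPool));
    }
}
```

By these averages, SASL costs roughly 18% on top of the plain TThreadPoolServer, and the
switch away from THsHaServer accounts for roughly another 11%, for about 31% in total against
the default server. The run-to-run spread is wide, so these are ballpark figures only.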
> Diffs
> -----
>   core/src/main/java/org/apache/accumulo/core/cli/ f6ea934
>   core/src/main/java/org/apache/accumulo/core/client/ 6fe61a5
>   core/src/main/java/org/apache/accumulo/core/client/impl/ e75bec6
>   core/src/main/java/org/apache/accumulo/core/client/impl/ f481cc3
>   core/src/main/java/org/apache/accumulo/core/client/impl/ 6dc846f
>   core/src/main/java/org/apache/accumulo/core/client/impl/ 5da803b
>   core/src/main/java/org/apache/accumulo/core/client/security/tokens/
>   core/src/main/java/org/apache/accumulo/core/conf/ e054a5f
>   core/src/main/java/org/apache/accumulo/core/rpc/ PRE-CREATION
>   core/src/main/java/org/apache/accumulo/core/rpc/ PRE-CREATION
>   core/src/main/java/org/apache/accumulo/core/rpc/ 6eace77
>   core/src/main/java/org/apache/accumulo/core/rpc/ 09bd6c4
>   core/src/main/java/org/apache/accumulo/core/rpc/ PRE-CREATION
>   core/src/main/java/org/apache/accumulo/core/rpc/ PRE-CREATION
>   core/src/main/java/org/apache/accumulo/core/security/ 525a958
>   core/src/test/java/org/apache/accumulo/core/cli/ ff49bc0
>   core/src/test/java/org/apache/accumulo/core/client/ PRE-CREATION
>   core/src/test/java/org/apache/accumulo/core/conf/ 40be70f
>   core/src/test/java/org/apache/accumulo/core/rpc/ PRE-CREATION
>   proxy/src/main/java/org/apache/accumulo/proxy/ 4b048eb
>   server/base/src/main/java/org/apache/accumulo/server/ 09ae4f4
>   server/base/src/main/java/org/apache/accumulo/server/init/ 046cfb5
>   server/base/src/main/java/org/apache/accumulo/server/rpc/
>   server/base/src/main/java/org/apache/accumulo/server/rpc/
>   server/base/src/main/java/org/apache/accumulo/server/rpc/ 641c0bf
>   server/base/src/main/java/org/apache/accumulo/server/rpc/ PRE-CREATION
>   server/base/src/main/java/org/apache/accumulo/server/security/
>   server/base/src/main/java/org/apache/accumulo/server/security/ 29e4939
>   server/base/src/main/java/org/apache/accumulo/server/security/
>   server/base/src/main/java/org/apache/accumulo/server/security/handler/
>   server/base/src/main/java/org/apache/accumulo/server/thrift/
>   server/base/src/test/java/org/apache/accumulo/server/
>   server/base/src/test/java/org/apache/accumulo/server/rpc/
>   server/base/src/test/java/org/apache/accumulo/server/security/
>   server/gc/src/main/java/org/apache/accumulo/gc/ 93a9a49
>   server/gc/src/test/java/org/apache/accumulo/gc/
>   server/gc/src/test/java/org/apache/accumulo/gc/ 99558b8
>   server/gc/src/test/java/org/apache/accumulo/gc/replication/
>   server/master/src/main/java/org/apache/accumulo/master/ 12195fa
>   server/tracer/src/main/java/org/apache/accumulo/tracer/ 7e33300
>   server/tserver/src/main/java/org/apache/accumulo/tserver/ d5c1d2f
>   shell/src/main/java/org/apache/accumulo/shell/ 58308ff
>   shell/src/main/java/org/apache/accumulo/shell/ 8167ef8
>   shell/src/test/java/org/apache/accumulo/shell/ 0e72c8c
>   shell/src/test/java/org/apache/accumulo/shell/ PRE-CREATION
>   test/src/main/java/org/apache/accumulo/test/functional/ eb84533
>   test/src/main/java/org/apache/accumulo/test/performance/thrift/ 2ebc2e3
>   test/src/test/java/org/apache/accumulo/server/security/ fb71f5f
> Diff:
> Testing
> -------
> Ensured existing unit tests still function. Accumulo is functional, and I ran continuous
ingest multiple times using a client with only a Kerberos identity (no user/password provided).
Used MIT Kerberos with Apache Hadoop 2.6.0 and Apache ZooKeeper 3.4.5.
> Thanks,
> Josh Elser
