hadoop-hdfs-dev mailing list archives

From Todd Lipcon <t...@cloudera.com>
Subject Re: hadoop.job.ugi backwards compatibility
Date Mon, 13 Sep 2010 17:05:43 GMT
On Mon, Sep 13, 2010 at 9:31 AM, Owen O'Malley <omalley@apache.org> wrote:

> Moving the discussion over to the more appropriate mapreduce-dev.

This is not MR-specific, since the strangely named hadoop.job.ugi determines
HDFS permissions as well. +CC hdfs-dev... though I actually think this is an
issue that users will have interest in, which is why I posted to general
initially rather than a dev list.

> On Mon, Sep 13, 2010 at 9:08 AM, Todd Lipcon <todd@cloudera.com> wrote:
> > 1) Groups resolution happens on the server side, where it used to
> > happen on the client. Thus, all Hadoop users must exist on the NN/JT
> > machines in order for group mapping to succeed (or the user must write
> > a custom group mapper).
> There is a plugin that performs the group lookup. See HADOOP-4656.
> There is no requirement for having the user accounts on the NN/JT,
> although that is the easiest approach. It is not recommended that the
> users be allowed to log in.

"or the user must write a custom group mapper" above refers to this plugin
capability. But I think most users do not want to spend the time to write
(or even set up) such a plugin beyond the default shell-based mapping.
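For illustration, a custom group mapper along the lines of HADOOP-4656 might look something like the sketch below. This is not from the thread: it assumes the `org.apache.hadoop.security.GroupMappingServiceProvider` plugin interface with a `getGroups(String)` method (the exact method set varies across Hadoop versions), and the class name `StaticGroupMapping` and the group names are hypothetical.

```java
package com.example.security;

import java.io.IOException;
import java.util.Arrays;
import java.util.List;

import org.apache.hadoop.security.GroupMappingServiceProvider;

// Hypothetical custom group mapper, sketched against the plugin
// point from HADOOP-4656. Returns a fixed group list instead of
// shelling out to the default shell-based mapping, so users need
// not have accounts on the NN/JT machines.
public class StaticGroupMapping implements GroupMappingServiceProvider {
  @Override
  public List<String> getGroups(String user) throws IOException {
    if ("hdfs".equals(user)) {
      return Arrays.asList("supergroup");
    }
    return Arrays.asList("users");
  }
}
```

Such a class would then be wired up in core-site.xml via the `hadoop.security.group.mapping` property, pointing it at `com.example.security.StaticGroupMapping`.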

> I think it is important that turning security on and off doesn't
> drastically change the semantics or protocols. That will become much
> much harder to support downstream.

As someone who spends an awful lot of time doing downstream support of lots
of different clusters, I actually disagree. I believe the majority of users
do *not* plan on turning on security, so keeping things simpler for them is
worth a lot. In many of these clusters the users, the ops team, and the
developers are all one and the same - it's not the multitenant "internal
service" model that we see at the larger installations like Yahoo.

> > 2) The hadoop.job.ugi parameter is ignored - instead the user has to
> > use the new UGI.createRemoteUser("foo").doAs() API, even in simple
> > security.
> User code that counts on hadoop.job.ugi working will be horribly
> broken once you turn on security. Turning on and off security should
> not involve testing all of your applications. It is unfortunate that
> we ever used the configuration value as the user, but continuing to
> support it will make our users' code much much more brittle.

The assumption above is "once you turn on security" - but many users will
not and probably never will turn on security. Providing a transition plan
for one version is our usual policy here - I agree that long term we would
like to do away with this hack of a configuration parameter. Since it's not
hard to provide a backwards compatibility path with a deprecation warning
for one version, are you against it? Or just saying that on your particular
clusters you will choose not to take advantage of it?
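For readers following the thread, the replacement API mentioned above looks roughly like the sketch below. The `UserGroupInformation.createRemoteUser()` and `doAs()` calls are the ones under discussion; the user name "bob" and the listed path are purely illustrative, and this is a sketch rather than a definitive usage guide.

```java
import java.security.PrivilegedExceptionAction;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.UserGroupInformation;

public class RemoteUserExample {
  public static void main(String[] args) throws Exception {
    final Configuration conf = new Configuration();
    // Act as "bob" (illustrative name), instead of the old
    // conf.set("hadoop.job.ugi", "bob,users") approach.
    UserGroupInformation ugi = UserGroupInformation.createRemoteUser("bob");
    ugi.doAs(new PrivilegedExceptionAction<Void>() {
      @Override
      public Void run() throws Exception {
        // Filesystem operations inside run() are performed as "bob".
        FileSystem fs = FileSystem.get(conf);
        for (FileStatus st : fs.listStatus(new Path("/"))) {
          System.out.println(st.getPath());
        }
        return null;
      }
    });
  }
}
```

The `doAs()` wrapper is what makes the acting identity explicit in code, which is precisely why code that silently relied on the hadoop.job.ugi configuration value needs changes.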


Todd Lipcon
Software Engineer, Cloudera
