hadoop-common-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent
Date Fri, 22 Apr 2011 00:57:05 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13023060#comment-13023060 ]

Todd Lipcon commented on HADOOP-7214:

Hi folks. There seem to be some concerns about this patch, but no one has actually vetoed
it. Most of the concerns appear to fall into the following three buckets:

*Concern 1:* "This command is only useful when a system is misconfigured."

Aaron provided the example of developer laptops connecting to a cluster. Another example we
see often is a case where Windows client nodes directly upload files to HDFS without going
through a "bastion box" or other proxy -- because of the differing group models between Unix
and Windows, it's not necessarily clear (especially to non-ops users) what groups they might
have on HDFS. Additionally, we often see people setting up Linux boxes with custom pam/nss
modules such as VAS which can have semantics that regular users may not expect.

So there are several cases where the command is useful even in a well-configured system.

I also agree that it's *very* useful when a system is misconfigured. Unfortunately, Hadoop
gets deployed in some environments with lax policies or inexperienced operations staff and
no central account management system such as LDAP. Although I agree this is a bad setup,
we should endeavor to provide basic tools that help a user diagnose how Hadoop is interacting
with their bad setup. Of course, we should also try to educate users on best practices at
the same time :)

*Concern 2:* "I don't particularly care about this, since my cluster is well set up, so we
shouldn't add it."

In practice, I've seen many support issues (on the mailing lists as well as among our customers)
where having this tool would have simplified diagnosis and understanding of a problem. So,
though it may not be useful for every deployment, it's useful for some, and shouldn't be blocked
on this basis.

*Concern 3:* "Adding this command opens greater security exposure"

While I agree with Aaron that this seems to be picking nits, it seems we can address this
by adding a configuration option that allows the command to be disabled.
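
To make the idea of a configuration switch concrete, here is a minimal sketch of how such a gate might look. It is not taken from the patch: the key name, the default of "enabled", and the plain dicts standing in for the configuration and the server-side group mapping are all assumptions made for illustration.

```python
# Hypothetical key name, invented for this sketch; not a real Hadoop property.
GROUPS_COMMAND_ENABLED_KEY = "example.groups.command.enabled"


class CommandDisabledError(Exception):
    """Raised when an administrator has switched the command off."""


def get_groups_for_user(conf, user, group_mapping):
    """Return the groups the server resolves for `user`, if the command is enabled.

    conf          -- dict of configuration strings (stand-in for a real config object)
    group_mapping -- dict of user -> list of groups (stand-in for server-side resolution)
    """
    # Assumption: default to enabled, so only admins who opt out see a change.
    enabled = conf.get(GROUPS_COMMAND_ENABLED_KEY, "true").strip().lower() == "true"
    if not enabled:
        raise CommandDisabledError("group lookup is disabled by configuration")
    # Unknown users resolve to an empty list rather than an error.
    return list(group_mapping.get(user, []))
```

With this shape, a security-conscious site would set the (hypothetical) key to "false" and callers would get a clear error instead of a group list, while everyone else keeps the diagnostic tool.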

Are there any other concerns that I'm failing to summarize above? If not, I will proceed to
review the implementation of the patch and assume that no one is planning on vetoing.

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt, hadoop-7214.2.txt, hadoop-7214.3.txt,
>                      hadoop-7214.4.txt, hadoop-7214.5.txt, hadoop-7214.6.txt
>
> Since user -> groups resolution is done on the NN and JT machines, there should be
> a way for users to determine what groups they're a member of from the NN's and JT's perspective.

This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
