Date: Thu, 7 Apr 2011 00:42:06 +0000 (UTC)
From: "Aaron T. 
Myers (JIRA)"
To: common-issues@hadoop.apache.org
Message-ID: <1658350729.39442.1302136926035.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <1243264568.21599.1301502005765.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394
X-Virus-Checked: Checked by ClamAV on apache.org

    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016632#comment-13016632 ]

Aaron T. Myers commented on HADOOP-7214:
----------------------------------------

Allen, I've previously seen you describe Hadoop as "essentially a distributed, networked operating system." I agree with that assessment. Presently, this OS provides {{chown}}, {{chgrp}}, and {{chmod}}, all of which presuppose the existence of groups associated with a user. However, this OS doesn't provide a way for a user to find out what groups they're a member of from the OS's perspective. I'm surprised you're resistant to adding this functionality. It seems to me to be a simple deficiency.

Detailed responses to your concerns are inline.

bq. My conclusion is simple: if there is an easy fix for this, go for it. But if we're inventing a bunch of tool-age to support brokenness, I think it is bad to add the weight long term (yes, this includes RESTs, RPCs, etc, etc).

You still haven't answered my question: which setup that I described above do you consider "broken" ? As I said previously, I can probably agree that having different users/groups on the NN vs the JT is indeed a misconfiguration, and we shouldn't concern ourselves with that scenario. But, do you also consider having different users/groups on the client machine vs the NN to be a misconfiguration? That seems like a perfectly reasonable setup to me, and one that we should support.

bq. 
BTW, it is also perfectly reasonable to expect that companies that decide to have split naming services to provide ways to query that information on their own.

Perhaps, but Hadoop also supports making the user -> group mapping service pluggable via the {{hadoop.security.group.mapping}} configuration parameter. Why should we require implementers of this to provide a way of querying this information on their own, through some other mechanism, rather than have Hadoop show it? When a Hadoop user gets a "permission denied" error from a Hadoop command, and wants to know what groups Hadoop thinks they belong to, they'll have to run "{{random-command-x}}" rather than something simple like "{{hadoop fs -groups}}". That only seems to make Hadoop harder to use.

bq. There could be some potential security issues that we might be circumventing by providing that information out-of-band.

Hadoop assumes that file system implementations are capable of associating files and directories with users and groups, as HDFS does. That's already part of the existing Hadoop commands. A user could presently determine what groups they're a member of by creating a file and then trying to {{chgrp}} it to different things. The set of inputs for which the {{chgrp}} succeeded would be the set of groups the user is a member of. Obviously, this isn't feasible for a normal user to do when they get a {{PermissionDeniedException}}, but it's perfectly reasonable for an attacker to do. My point is just that Hadoop isn't hiding this information as it stands. Hadoop makes decisions based on the groups a user belongs to, so we should make it easy for our users to find out what groups Hadoop thinks they belong to.

bq. The other thing to keep in mind that going down the path of 'hadoop groups' is too limiting. If we are going to provide group information, why not also provide uid, username, etc.

Showing the username seems reasonable to me, and in fact the patch I'm working on displays this. 
Hadoop doesn't make decisions based on one's UID, so why should we show that?

bq. In the case of a kerberized environment, there is no guarantee that the TGT info matches what is actually executed on the compute nodes due to remapping...

I don't follow this reasoning. Kerberos doesn't have any notion of groups. But, the first component of the Kerberos principal name is used as the username when the NN and JT determine a user's groups. I don't see how we need to account for anything differently with or without Kerberos support enabled.

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira