From: "Kurtis Heimerl" <munncha@gmail.com>
To: hadoop-dev@lucene.apache.org
Date: Thu, 19 Apr 2007 13:20:54 -0700
Subject: Re: Fwd: hadoop file permissions

Some other notes/questions:

On 4/19/07, Kurtis Heimerl wrote:
>
> On 4/19/07, Doug Cutting wrote:
> >
> > Kurtis Heimerl wrote:
> > >> Yes, DFSClient will need to pass the user to the namenode.
> > >>
> > >> Perhaps the username should be put in the FileSystem's URI. So an HDFS
> > >> URI would become hdfs://user@namenode:5555/foo/bar. URIs without a
> > >> username would have "other" access (typically read-only).
> > >
> > > That's reasonable. I don't know how Kerberos plays with that, though.
> >
> > I chatted with Owen a bit yesterday about this and think it's better to
> > keep the username in the config. A FileSystem is created given a URI
> > and a Configuration. FileSystems are currently cached, keyed on the
> > URI's protocol and authority (host & port, typically).
> > We should add the configuration to the cache key too, so that different
> > FileSystem instances are used for different users. That permits
> > FileSystem implementations to use arbitrary config properties in their
> > constructors.
> >
> > I think we should be able to put a Kerberos ticket into the
> > configuration.
>
> I think I'm understanding the plan here. NameNode.java reads the location
> of the namenode instance from config. So, we'll insert the username and
> groups into the config. On the first iteration, this will not be
> authenticated. This information will be passed to the namenode server,
> which will translate the name and groups to a UID and GID, which are
> stored with the files.
>
> Sounds like a reasonable thing. There's one problem here: each user will
> require their own config file. This is not the way I've seen Hadoop
> currently run, but if we all agree that this is the way to go, I'll begin
> a prototype very soon.

OK, I have an architectural question. I think I get the client-side stack:
DFSClient creates a proxy, which connects to the namenode. This all uses
ClientProtocol. So, to implement what I need, I'll probably need to modify
ClientProtocol and NameNode.

Now we have the whole DistributedFileSystem and FileSystem stuff. I see the
cache in FileSystem; I just don't see where in the stack it is. It's
server-side, I assume. I see where we instantiate the NameNode on the
server, but it seemingly just deals with blocks. Where's the filesystem at?

> > We should have an equivalent of /etc/groups in the namenode.
> >
> > > So, what I think it does is that it validates that the user really is
> > > user@client.com. [ ... ]
> > >
> > > There's a chance Kerberos actually validates that it's user@server.com.
> >
> > Kerberos validates that a user is user@domain, where both the user and
> > the domain are part of Kerberos, not some host. Initially we'll not do
> > any user validation, but just trust the username sent.
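To make the caching idea above concrete: here is a minimal, hypothetical sketch
(not actual Hadoop code) of a cache key that combines the URI's protocol and
authority with a username read from the configuration. The property name
"hadoop.job.ugi", the default of "other" for URIs/configs without a username,
and the use of a plain Map in place of a real Configuration are all my
assumptions for illustration.

```java
import java.net.URI;
import java.util.Map;
import java.util.Objects;

// Sketch of the proposed FileSystem cache key: today the cache is keyed on
// the URI's protocol and authority; adding the username (taken from the
// config, unauthenticated at first) gives each user a distinct FileSystem
// instance.
public class FsCacheKey {
    final String scheme;     // URI protocol, e.g. "hdfs"
    final String authority;  // host & port, e.g. "namenode:5555"
    final String username;   // from config; "other" when absent

    FsCacheKey(URI uri, Map<String, String> conf) {
        this.scheme = uri.getScheme();
        this.authority = uri.getAuthority();
        // "hadoop.job.ugi" is illustrative only; the real property name
        // was not settled in this thread.
        this.username = conf.getOrDefault("hadoop.job.ugi", "other");
    }

    @Override public boolean equals(Object o) {
        if (!(o instanceof FsCacheKey)) return false;
        FsCacheKey k = (FsCacheKey) o;
        return Objects.equals(scheme, k.scheme)
            && Objects.equals(authority, k.authority)
            && Objects.equals(username, k.username);
    }

    @Override public int hashCode() {
        return Objects.hash(scheme, authority, username);
    }
}
```

Note that the path is deliberately excluded, so all paths on one namenode for
one user share a single cached FileSystem, while two users hitting the same
namenode get separate instances.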
> There's accountability, but not great protection. If someone put their
> client into Kerberos and it was accepted, they could take any role they
> wanted.
>
> That is, if I understood what you are talking about.
>
> > We might be able to get away without groups, but it would be awkward.
> > For example, if the default file permission is -rw-rw-r--, then, without
> > groups, anyone can read any file, but folks can only remove files
> > they've created. That doesn't permit read/write sharing of data without
> > changing its owner.
> >
> > We probably also need a "root" username that can do anything.
>
> I think groups and root are easy, so I plan to implement those initially
> as well. Is there any more reasonable way to do root than just hardcoding
> that root can do anything? I thought about adding root to all groups, but
> there's a chance that a file has no groups. I guess I could add one root
> group that simply contains root. That would allow the service to let
> others run as root as well.
>
> > Doug
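The owner/group/other check with a "root group" override discussed above could
be sketched roughly like this. This is not Hadoop code; the group name "root"
and the POSIX-style octal mode bits are my assumptions for illustration.

```java
import java.util.Set;

// Sketch of the proposed permission check: POSIX-style rw bits checked in
// owner, then group, then "other" order, with a dedicated "root" group
// whose members may do anything (avoiding a hardcoded special username).
public class PermCheck {
    static final String ROOT_GROUP = "root";  // assumed group name

    // mode uses POSIX-style bits, e.g. 0664 for -rw-rw-r--
    static boolean canRead(int mode, String owner, String group,
                           String user, Set<String> userGroups) {
        if (userGroups.contains(ROOT_GROUP)) return true;   // root may do anything
        if (user.equals(owner))         return (mode & 0400) != 0;  // owner read
        if (userGroups.contains(group)) return (mode & 0040) != 0;  // group read
        return (mode & 0004) != 0;                                  // other read
    }
}
```

With a default mode of 0664 (-rw-rw-r--), anyone can read; tightening it to
0660 restricts reads to the owner, the file's group, and members of the root
group, which is exactly the read/write sharing case Doug raises.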