hadoop-common-dev mailing list archives

From "Kurtis Heimerl" <munn...@gmail.com>
Subject Re: Fwd: hadoop file permissions
Date Thu, 19 Apr 2007 20:20:54 GMT
Some other notes/questions:

On 4/19/07, Kurtis Heimerl <munncha@gmail.com> wrote:
> On 4/19/07, Doug Cutting <cutting@apache.org> wrote:
> >
> > Kurtis Heimerl wrote:
> > >> Yes, DFSClient will need to pass the user to the namenode.
> > >>
> > >> Perhaps the username should be put in the FileSystem's URI.  So an HDFS
> > >> URI would become hdfs://user@namenode:5555/foo/bar.  URIs without a
> > >> username would have "other" access (typically read-only).
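
[Illustration] The URI proposal quoted above can be sketched with the standard java.net.URI class, which already exposes the "user@" part of an authority. This is only a sketch under assumptions: the class UriUserSketch and the fallback name "other" are hypothetical and are not part of Hadoop.

```java
import java.net.URI;

// Hypothetical helper, not Hadoop code: extracts the user from an hdfs:// URI.
final class UriUserSketch {
    // Assumed fallback identity for URIs that carry no username.
    static final String OTHER = "other";

    static String userOf(URI uri) {
        // getUserInfo() returns null when the authority has no "user@" part.
        String userInfo = uri.getUserInfo();
        return (userInfo == null || userInfo.isEmpty()) ? OTHER : userInfo;
    }

    public static void main(String[] args) {
        System.out.println(userOf(URI.create("hdfs://kurtis@namenode:5555/foo/bar")));
        System.out.println(userOf(URI.create("hdfs://namenode:5555/foo/bar")));
    }
}
```

Nothing here authenticates the name; it only shows where the client would read it from.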
> > >
> > > That's reasonable. I don't know how Kerberos plays with that, though.
> >
> > I chatted with Owen a bit yesterday about this and think it's better to
> > keep the username in the config.  A FileSystem is created given a URI
> > and a Configuration.  FileSystems are currently cached, keyed on the
> > URI's protocol and authority (host & port, typically).  We should add
> > the configuration to the cache key too, so that different FileSystem
> > instances are used for different users.  That permits FileSystem
> > implementations to use arbitrary config properties in their ctor.
> >
> > I think we should be able to put a Kerberos ticket into the
> > configuration.
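
[Illustration] A minimal sketch of the cache-key idea quoted above, assuming the user travels in the configuration. To stay self-contained it uses a plain Map in place of Hadoop's Configuration, and it keys on a single hypothetical property ("dfs.user.name") rather than on the whole configuration as Doug suggests. All names here are assumptions, not Hadoop APIs.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for the FileSystem cache; not Hadoop code.
final class FsCacheSketch {
    // Assumed property name for the user; chosen only for this sketch.
    static final String USER_PROPERTY = "dfs.user.name";

    // Cache key: scheme and authority (as today) plus the configured user.
    static final class Key {
        final String scheme;
        final String authority;
        final String user;

        Key(URI uri, Map<String, String> conf) {
            this.scheme = uri.getScheme();
            this.authority = uri.getAuthority();
            this.user = conf.get(USER_PROPERTY); // may be null (no user set)
        }

        @Override public boolean equals(Object o) {
            if (!(o instanceof Key)) return false;
            Key k = (Key) o;
            return Objects.equals(scheme, k.scheme)
                && Objects.equals(authority, k.authority)
                && Objects.equals(user, k.user);
        }

        @Override public int hashCode() {
            return Objects.hash(scheme, authority, user);
        }
    }

    private final Map<Key, Object> cache = new HashMap<>();

    // Returns one cached instance per (scheme, authority, user) triple.
    // A bare Object stands in for a real FileSystem instance.
    Object get(URI uri, Map<String, String> conf) {
        return cache.computeIfAbsent(new Key(uri, conf), k -> new Object());
    }
}
```

With this key, two users talking to the same namenode get distinct instances, while repeated lookups by the same user reuse one.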
> I think I'm understanding the plan here. NameNode.java reads the location
> of the namenode instance from config. So, we'll insert username and groups
> into the config. On the first iteration, this will not be authenticated.
> This information will be passed to the namenode server, which will translate
> the name and groups to UID and GID, which are stored with the files.
> Sounds like a reasonable thing. There's one problem here: each user will
> require their own config file. This is not the way I've seen Hadoop
> currently run, but if we all agree that this is the way to go, I'll
> begin a prototype very soon.

OK, I have an architectural question. I think I get the client-side stack.
DFSClient creates a proxy, which connects to the namenode. This all uses
ClientProtocol. So, to implement what I need, I'll probably need to modify
ClientProtocol and NameNode.

Then there is the whole DistributedFileSystem and FileSystem layer. I see the
cache in FileSystem, I just don't see where in the stack it sits. It's
server-side, I assume. I see where we instantiate the NameNode on the server,
but it seemingly just deals with blocks. Where does the filesystem itself live?

>> We should have an equivalent of /etc/group in the namenode.
> > >
> > > So, what I think it does is that it validates that the user really is
> > > user@client.com. [ ... ]
> > >
> > > There's a chance Kerberos actually validates that it's user@server.com.
> >
> > Kerberos validates that a user is user@domain, where both the user and
> > the domain are part of Kerberos, not some host.  Initially we'll not do
> > any user validation, but just trust the username sent.
> There's accountability, but not great protection. If someone put their
> client into Kerberos and it was accepted, they could take any role they
> wanted.
> That is, if I understood what you are talking about.
> We might be able to get away without groups, but it would be awkward.
> > For example, if the default file permission is -rw-rw-r--, then, without
> > groups, anyone can read any file, but folks can only remove files
> > they've created.  That doesn't permit read/write sharing of data w/o
> > changing its owner.
> >
> > We probably also need a "root" username that can do anything.
> I think groups and root are easy, so I plan to implement those initially
> as well. Is there any more reasonable way to do root than just hardcoding
> that root can do anything? I thought about adding root to all groups, but
> there's a chance that a file has no group. I guess I could add one root
> group that simply contains root. That would also let the service grant
> root privileges to other users.
> Doug
> >
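
[Illustration] The owner/group/other discussion above, including a hardcoded root, can be sketched as a Unix-style mode check. This is a sketch under assumptions, not the namenode's implementation: the class name, the "root" constant, and the use of a 9-bit octal mode are all choices made for this example, and the username is trusted as sent, as in the first iteration described in the thread.

```java
import java.util.Set;

// Hypothetical permission check; not Hadoop code.
final class PermissionSketch {
    static final String SUPERUSER = "root"; // assumed hardcoded superuser name
    static final int READ = 4, WRITE = 2, EXECUTE = 1;

    // mode is a 9-bit Unix-style value, e.g. 0664 for -rw-rw-r--.
    static boolean allowed(String user, Set<String> userGroups,
                           String owner, String group, int mode, int wanted) {
        if (SUPERUSER.equals(user)) {
            return true; // root bypasses every check
        }
        int bits;
        if (user.equals(owner)) {
            bits = (mode >> 6) & 7;   // owner bits
        } else if (group != null && userGroups.contains(group)) {
            bits = (mode >> 3) & 7;   // group bits; a file may have no group
        } else {
            bits = mode & 7;          // "other" bits
        }
        return (bits & wanted) == wanted;
    }
}
```

With mode 0664, a member of the file's group can write without any change of owner, which is the read/write sharing case Doug raises; a user outside the group can only read.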
