hadoop-common-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Christophe Taton (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-1298) adding user info to file
Date Tue, 24 Jul 2007 07:59:31 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-1298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Christophe Taton updated HADOOP-1298:

    Attachment: hadoop-dev-20070724-0020.patch.gz

Hi all,

Thanks for the time you spent on my patch, I really appreciate that. Here are some comments
on the comments:

> - I have tried to apply hadoop-dev-20070720-1633.patch.gz to the current trunk but the
program cannot be compiled.  Did you update your version with current trunk?  It seems to
me that the patch is out dated

I updated and synced with the current trunk version a few minutes before creating the patch.
However the patch has not included the changes I made on FSDirectory! Attached to this comment
is a patch containing all the changes I made, including those of FSDirectory. The new patch
works. Sorry for this big mess.

> - Using DFSClientContext as a parameter in all method calls for checking permissions.
 The idea is not right.  We should pass some kind of credential / ticket / signature, instead
of a context.

This is a matter of vocabulary. I agree though. This parameter is only a way to identify the
emitter of a request and account for it. It currently contains the user name because I do
not provide any authentication or sessions. Which term do you prefer?

> - I think we need authentication server(s) to manage users (i.e. UserDatabase should
be resided in an authentication servers).  System components authenticate itself with the
server and obtain credentials.  It reduces the complication and use of resource in Namenode.

Doug suggested to use Kerberos in the long-term; for the short-term, UnixSystem#getUserName()
is enough.
As I said in previous comments, the short-term purpose of this patch is to prevent people
from breaking files by accident and not to provide strong security.

>- INode has
> private String owner;
> private String group;
> private short mode;
> It is better to have a FilePermission class (which extends java.security.PermissionCollection)
to encapsulate them.  I think it is valid to assume that there are many file having the same
permissions.  Then, we might be able to save spaces.  Also, we will have the flexibility that
use permission model other than posix.

I am not really comfortable with sharing permissions among files. Sharing owners and groups
strings is not so complex and almost already achieved in my patch if we want to spare memory.
Sharing a permission means we need to duplicate a permission on write and to maintain a map
of existing permissions.

> - It seems to me that the patch was not careful written: for example, in class Path,
there is a constant SEPARATOR (= "/") but the codes inside keep using "/".

I apologize, there are a few places where I did not change the already existing "/" into a
Path.SEPARATOR. I'll change them from now. That said I did not introduce any new "/" except
in the class Path itself, in the Path normalization method.

> - HDFS should have its own user management.  It is not right to use arbitrary username
(like OS's username), especially there is no real security implemented.  It will bring a lot
of problems by user's careless mistakes later.

Again, the short-term purpose is to prevent people from deleting or altering others files
by accident. What is you conception of "real security" here? To my mind, this purpose is achieved,
assuming that people have no will to break the system.

> - All entries (user, host, etc.) should have its own principal.  Something like "nobody"
is just a shared account.

What do you mean here?

> - The patch changed a lot of unnecessary code formatting.  That's why it has 8000+ lines.

I tried to follow the guidelines and to avoid unnecessary code formatting. Seems that I failed...
I'll do better next time.
Anyway the patch is big as I changed FSDirectory and INode a lot. Apparently too much. I agree
that the refactoring of FSDirectory.INode makes the patch unreadable for FSDirectory and INode.

> Also, the codes have a lot of lines longer than 80 characters.

Fixed. For most of those wide lines, I did not want to break the line into multiple lines
so as to keep current formatting. Apparently, the 80 characters rules has priority over this.

> - FileMode and FileAcccessMode are overlapped.

True. I started to write FileAccessMode as it is "pure Java5". As I wrote it, I started to
think about the indirections and the overhead introduced by this enum. I sticked to the regular
integer version as it is convenient to write and test file modes against integers.

>  FileAcccessMode using enum is better.


> - why change FSDirectory.INode to INode?  Again, it makes the patch hard to follow.

I asked whether this change would cause an issue or not. As I had no answer in time, I decided
to make this change as :
1. There was no need in design for INode to be included in FSDirectory.
2. Including INode inside FSDirectory made the file big and hard to read without any reason.
3. The code was already tagged with "TODO: Factor out INode as a standalone class"

My question is: what should be easy to read? the patch or the code? I would like to have both,
in an ideal world.

> - bad idea to use static variables for non-constants, e.g. in UserDatabase,
>  private final static UserDatabase db = new UserDatabase();
> it will be difficult to manage, e.g. restarting NameNode.

I agree. I'll change this.

> - There are un-used classes Pair and Triplet.  Programming exercises?

Pair is unused, I remove it. Triplet is used in the changes I made to FSDirectory, which are
not visible in the previous crappy patch I provided. The new patch now contains the changes
I made to FSDirectory. I needed to have a method returning 3 values. I am ready to change
this if you do not like it. I prefered that solution over single-sized array parameters.

> - missing javadoc for many methods.

The javadoc I provided is almost certainly incomplete. But *many* methods?

> For a messy patch like this, how many of you think that it is bug free and does not hide
some back doors?

Is this your way to encourage people contributing to Hadoop?

> I suggest to remove all the unnecessary changes and work on a small core patch and I
am willing to do it.

I put a lot of energy in designing efficient and clean ways to integrate permissions to hdfs
as well as in providing an accurate javadoc. The patch has actually become big as I iterated
on the changes many times and diverged from the trunk. To my mind, this does not mean that
my code is crappy.
Of course I am willing to adapt and break this patch into many small and more understandable
patches to make you more comfortable.

Again, thanks for the time you spent on my patch.
Christophe T.

> adding user info to file
> ------------------------
>                 Key: HADOOP-1298
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1298
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs, fs
>            Reporter: Kurtis Heimerl
>             Fix For: 0.15.0
>         Attachments: hadoop-dev-20070720-1633.patch.gz, hadoop-dev-20070724-0020.patch.gz,
hadoop-user-munncha.patch, hadoop-user-munncha.patch, hadoop-user-munncha.patch, hadoop-user-munncha.patch10,
hadoop-user-munncha.patch11, hadoop-user-munncha.patch12, hadoop-user-munncha.patch13, hadoop-user-munncha.patch14,
hadoop-user-munncha.patch15, hadoop-user-munncha.patch16, hadoop-user-munncha.patch17, hadoop-user-munncha.patch4,
hadoop-user-munncha.patch5, hadoop-user-munncha.patch6, hadoop-user-munncha.patch7, hadoop-user-munncha.patch8,
hadoop-user-munncha.patch9, hdfs-access-control.patch.gz
> I'm working on adding a permissions model to hadoop's DFS. The first step is this change,
which associates user info with files. Following this I'll assoicate permissions info, then
block methods based on that user info, then authorization of the user info. 
> So, right now i've implemented adding user info to files. I'm looking for feedback before
I clean this up and make it offical. 
> I wasn't sure what release, i'm working off trunk. 

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.

View raw message