hadoop-common-dev mailing list archives

From "Allen Wittenauer (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-1298) adding user info to file
Date Fri, 21 Sep 2007 00:57:51 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12529297 ]

Allen Wittenauer commented on HADOOP-1298:

FWIW, I've been looking at ApacheDS in its stand-alone mode to provide the LDAP and Kerberos
infrastructure for the grids at Yahoo!.  While it is a very young product, it holds a lot
of promise.... 

For (likely very) small sites, I could see this as a potential win.  They probably have a
namenode with enough memory.  If they don't, then they are essentially in the same boat as
the large site situation...

For large sites, there is a very high probability they already have some sort of major naming
services implementation, be it Kerberos, LDAP, or otherwise.  They are going to want to integrate
Hadoop into those services, which means that the embedded DS instance would need to be able
to replicate data from the master source.  Chances are very high that their replication
technologies aren't going to work with ApacheDS, and the embedded DS is going to end up being
nothing but a referral server, assuming it has that functionality.  Sizing-wise, we've already
seen what happens in the xx million case with 0.13 on a 16GB namenode.  I don't think embedding
or running a DS/KDC side-by-side is viable without bigger hardware.  If one has that bigger
hardware, they are just as likely to run a copy of their own DS/KDC bits instead of using the
embedded one anyway.

In the case of the places without a KDC or even a DS, it might be useful instead to recommend
as part of the Hadoop documentation that they set up naming services replication local to the
namenode, one per (some value) data nodes, and one per (some value) MR nodes.  If they don't have
any major naming services in place already, then update the ApacheDS section where they talk
about grids (about halfway down the page http://cwiki.apache.org/DIRxINTEROP/ ) to include
a discussion of using it with Hadoop and then point to that.

[Sidenote: IIRC, the MIT (and Sun and ...) KDC keeps the entire Kerberos DB in memory.  I
haven't worked enough with the ApacheDS to see how it functions in this regard.]

> adding user info to file
> ------------------------
>                 Key: HADOOP-1298
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1298
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs, fs
>            Reporter: Kurtis Heimerl
>            Assignee: Christophe Taton
>             Fix For: 0.15.0
>         Attachments: 1298_2007-09-06b.patch, 1298_2007-09-07g.patch, hadoop-user-munncha.patch17
>
> I'm working on adding a permissions model to Hadoop's DFS. The first step is this change,
> which associates user info with files. Following this I'll associate permissions info, then
> block methods based on that user info, then authorization of the user info.
> So, right now I've implemented adding user info to files. I'm looking for feedback before
> I clean this up and make it official.
> I wasn't sure what release; I'm working off trunk.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
