hadoop-hdfs-commits mailing list archives

From e..@apache.org
Subject svn commit: r1035552 - in /hadoop/hdfs/trunk: CHANGES.txt src/docs/src/documentation/content/xdocs/hdfs_permissions_guide.xml
Date Tue, 16 Nov 2010 08:10:02 GMT
Author: eli
Date: Tue Nov 16 08:10:02 2010
New Revision: 1035552

URL: http://svn.apache.org/viewvc?rev=1035552&view=rev
Log:
HDFS-1387. Update HDFS permissions guide for security. Contributed by Todd Lipcon


Modified:
    hadoop/hdfs/trunk/CHANGES.txt
    hadoop/hdfs/trunk/src/docs/src/documentation/content/xdocs/hdfs_permissions_guide.xml

Modified: hadoop/hdfs/trunk/CHANGES.txt
URL: http://svn.apache.org/viewvc/hadoop/hdfs/trunk/CHANGES.txt?rev=1035552&r1=1035551&r2=1035552&view=diff
==============================================================================
--- hadoop/hdfs/trunk/CHANGES.txt (original)
+++ hadoop/hdfs/trunk/CHANGES.txt Tue Nov 16 08:10:02 2010
@@ -191,6 +191,8 @@ Trunk (unreleased changes)
     HDFS-1187. Modify fetchdt to allow renewing and canceling token.
     (Owen O'Malley and Kan Zhang via jghoman)
 
+    HDFS-1387. Update HDFS permissions guide for security. (Todd Lipcon via eli)
+
   OPTIMIZATIONS
 
     HDFS-1140. Speedup INode.getPathComponents. (Dmytro Molkov via shv)

Modified: hadoop/hdfs/trunk/src/docs/src/documentation/content/xdocs/hdfs_permissions_guide.xml
URL: http://svn.apache.org/viewvc/hadoop/hdfs/trunk/src/docs/src/documentation/content/xdocs/hdfs_permissions_guide.xml?rev=1035552&r1=1035551&r2=1035552&view=diff
==============================================================================
--- hadoop/hdfs/trunk/src/docs/src/documentation/content/xdocs/hdfs_permissions_guide.xml (original)
+++ hadoop/hdfs/trunk/src/docs/src/documentation/content/xdocs/hdfs_permissions_guide.xml Tue Nov 16 08:10:02 2010
@@ -71,26 +71,41 @@
 
 <section><title>User Identity</title>
 <p>
-In this release of Hadoop the identity of a client process is just whatever the host operating system says it is. For Unix-like systems,
+As of Hadoop 0.22, Hadoop supports two different modes of operation to determine the user's identity, specified by the
+<code>hadoop.security.authentication</code> property:
 </p>
-<ul>
-<li>
-   The user name is the equivalent of <code>`whoami`</code>;
-</li>
-<li>
-   The group list is the equivalent of <code>`bash -c groups`</code>.
-</li>
-</ul>
+<dl>
+  <dt><code>simple</code></dt>
+  <dd>In this mode of operation, the identity of a client process is determined by the host operating system. On Unix-like systems,
+  the user name is the equivalent of <code>`whoami`</code>.</dd>
+  <dt><code>kerberos</code></dt>
+  <dd>In Kerberized operation, the identity of a client process is determined by its Kerberos credentials. For example, in a
+  Kerberized environment, a user may use the <code>kinit</code> utility to obtain a Kerberos ticket-granting-ticket (TGT) and
+  use <code>klist</code> to determine their current principal. When mapping a Kerberos principal to an HDFS username, all <em>components</em> except for the <em>primary</em> are dropped. For example, a principal <code>todd/foobar@CORP.COMPANY.COM</code> will act as the simple username <code>todd</code> on HDFS.
+  </dd>
+</dl>
+<p>
+Regardless of the mode of operation, the user identity mechanism is extrinsic to HDFS itself.
+There is no provision within HDFS for creating user identities, establishing groups, or processing user credentials.
+</p>
+</section>
 
+<section><title>Group Mapping</title>
+<p>
+Once a username has been determined as described above, the list of groups is determined by a <em>group mapping
+service</em>, configured by the <code>hadoop.security.group.mapping</code> property.
+The default implementation, <code>org.apache.hadoop.security.ShellBasedUnixGroupsMapping</code>, will shell out
+to the Unix <code>bash -c groups</code> command to resolve a list of groups for a user.
+</p>
 <p>
-In the future there will be other ways of establishing user identity (think Kerberos, LDAP, and others). There is no expectation that 
-this first method is secure in protecting one user from impersonating another. This user identity mechanism combined with the 
-permissions model allows a cooperative community to share file system resources in an organized fashion.
+For HDFS, the mapping of users to groups is performed on the NameNode. Thus, the host system configuration of
+the NameNode determines the group mappings for the users.
 </p>
 <p>
-In any case, the user identity mechanism is extrinsic to HDFS itself. There is no provision within HDFS for creating user identities, 
-establishing groups, or processing user credentials.
+Note that HDFS stores the user and group of a file or directory as strings; there is no conversion from user and
+group identity numbers as is conventional in Unix.
 </p>
+
 </section>
 
 <section> <title>Understanding the Implementation</title>
@@ -104,14 +119,6 @@ A second request made to find additional
 that already knows the blocks of the file. With the addition of permissions, a client's access to a file may be withdrawn between 
 requests. Again, changing permissions does not revoke the access of a client that already knows the file's blocks.
 </p>
-<p>
-The MapReduce framework delegates the user identity by passing strings without special concern for confidentiality. The owner 
-and group of a file or directory are stored as strings; there is no conversion from user and group identity numbers as is conventional in Unix.
-</p>
-<p>
-The permissions features of this release did not require any changes to the behavior of data nodes. Blocks on the data nodes 
-do not have any of the <em>Hadoop</em> ownership or permissions attributes associated with them.
-</p>
 </section>
      
 <section> <title>Changes to the File System API</title>
@@ -198,19 +205,12 @@ permission parameter <em>P</em>) is used
 
 <section> <title>The Web Server</title>
 <p>
-The identity of the web server is a configuration parameter. That is, the name node has no notion of the identity of 
+By default, the identity of the web server is a configuration parameter. That is, the name node has no notion of the identity of 
 the <em>real</em> user, but the web server behaves as if it has the identity (user and groups) of a user chosen 
-by the administrator. Unless the chosen identity matches the super-user, parts of the name space may be invisible 
+by the administrator. Unless the chosen identity matches the super-user, parts of the name space may be inaccessible
 to the web server.</p>
 </section>
 
-<section> <title>On-line Upgrade</title>
-<p>
-If a cluster starts with a version 0.15 data set (<code>fsimage</code>), all files and directories will have 
-owner <em>O</em>, group <em>G</em>, and mode <em>M</em>, where <em>O</em> and <em>G</em> 
-are the user and group identity of the super-user, and <em>M</em> is a configuration parameter. </p>
-</section>
-
 <section> <title>Configuration Parameters</title>
 <ul>
 	<li><code>dfs.permissions = true </code>
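
Illustration (not part of this commit): the principal-to-username mapping described in the updated User Identity section can be sketched as follows. This is a minimal Python sketch under stated assumptions, not Hadoop's implementation: the function name principal_to_username is hypothetical, and the sketch assumes a principal of the form primary[/component...][@REALM] with no escaped separators. The mapping a real cluster applies may handle more cases.

```python
def principal_to_username(principal: str) -> str:
    """Return the primary of a Kerberos principal (hypothetical helper).

    Assumes the form primary[/component...][@REALM] with no escaped
    '/' or '@' characters. Illustrative only; not Hadoop code.
    """
    # Drop the realm: everything from the first '@' onward.
    name, _, _realm = principal.partition("@")
    # Drop every component after the primary: everything from the first '/'.
    primary, _, _rest = name.partition("/")
    if not primary:
        raise ValueError(f"principal has no primary: {principal!r}")
    return primary


if __name__ == "__main__":
    # The example principal from the guide acts as the simple username "todd".
    print(principal_to_username("todd/foobar@CORP.COMPANY.COM"))
```

With the example from the guide, todd/foobar@CORP.COMPANY.COM maps to todd. A principal without extra components or without a realm maps to its primary in the same way.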


