hadoop-hdfs-issues mailing list archives

From "Yongjun Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-5767) Nfs implementation assumes userName userId mapping to be unique, which is not true sometimes
Date Mon, 20 Jan 2014 18:37:19 GMT

    [ https://issues.apache.org/jira/browse/HDFS-5767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13876668#comment-13876668 ]

Yongjun Zhang commented on HDFS-5767:
-------------------------------------

Hi [~brandonli], 

About the example you provided (user "hadoop" with id 123...), I assume you meant that when we
issue the command "getent passwd hadoop", it returns the first match found for user "hadoop",
based on the search order configured in nsswitch.conf.

But I would expect "getent passwd" (with no user specified on the command line) to still return two
entries for user "hadoop", based on the spec of getent:

"
       The getent command displays entries from databases supported by the Name Service Switch
       libraries, which are configured in /etc/nsswitch.conf. If one or more key arguments are
       provided, then only the entries that match the supplied keys will be displayed. Otherwise,
       if no key is provided, all entries will be displayed (unless the database does not support
       enumeration).
"

May I know whether you meant

A. if nsswitch.conf is set up correctly, then "getent passwd" should return a unique mapping
for all users?

or 

B. duplicate entries should be physically removed from the databases (used by nsswitch.conf),
so that "getent passwd" has no chance of returning multiple entries for the same user?
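
To make the distinction I am asking about concrete, here is a minimal Java sketch. It is only
an illustration: the class and method names are hypothetical and are not part of Hadoop or of
getent. It models a keyed lookup as "first match in search order" (my assumption about the
behaviour described above) and an enumeration as "every entry, duplicates included". The sample
lines in the usage note are taken from the issue description quoted below.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** Hypothetical helper: keyed lookup vs. full enumeration over passwd-style lines. */
final class PasswdLookupSketch {

    /** Returns the name field (first colon-separated field) of a passwd-style line. */
    static String nameOf(String line) {
        return line.split(":", -1)[0];
    }

    /**
     * Models a keyed lookup such as "getent passwd NAME", ASSUMING it stops at
     * the first match in the configured search order.
     */
    static Optional<String> firstMatch(List<String> entriesInSearchOrder, String name) {
        for (String line : entriesInSearchOrder) {
            if (nameOf(line).equals(name)) {
                return Optional.of(line);
            }
        }
        return Optional.empty();
    }

    /**
     * Models filtering the output of an enumeration ("getent passwd" with no key):
     * every entry for the name is kept, so duplicates remain visible.
     */
    static List<String> allMatches(List<String> entriesInSearchOrder, String name) {
        List<String> matches = new ArrayList<>();
        for (String line : entriesInSearchOrder) {
            if (nameOf(line).equals(name)) {
                matches.add(line);
            }
        }
        return matches;
    }
}
```

With the two "bin" lines from the description as input, firstMatch yields a single entry while
allMatches yields both, which is the situation the question above is about.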

Thanks.


> Nfs implementation assumes userName userId mapping to be unique, which is not true sometimes
> --------------------------------------------------------------------------------------------
>
>                 Key: HDFS-5767
>                 URL: https://issues.apache.org/jira/browse/HDFS-5767
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: nfs
>    Affects Versions: 2.3.0
>         Environment: With LDAP enabled
>            Reporter: Yongjun Zhang
>            Assignee: Brandon Li
>
> I'm seeing that the nfs implementation assumes a unique <userName, userId> pair to be returned by the command "getent passwd". That is, for a given userName, there should be a single userId, and for a given userId, there should be a single userName. The reason is explained in the following message:
>  private static final String DUPLICATE_NAME_ID_DEBUG_INFO = "NFS gateway can't start with duplicate name or id on the host system.\n"
>       + "This is because HDFS (non-kerberos cluster) uses name as the only way to identify a user or group.\n"
>       + "The host system with duplicated user/group name or id might work fine most of the time by itself.\n"
>       + "However when NFS gateway talks to HDFS, HDFS accepts only user and group name.\n"
>       + "Therefore, same name means the same user or same group. To find the duplicated names/ids, one can do:\n"
>       + "<getent passwd | cut -d: -f1,3> and <getent group | cut -d: -f1,3> on Linux systms,\n"
>       + "<dscl . -list /Users UniqueID> and <dscl . -list /Groups PrimaryGroupID> on MacOS.";
> This requirement cannot always be met (e.g. because of the use of LDAP). Let's look at an example.
> What exists in /etc/passwd:
> $ more /etc/passwd | grep ^bin
> bin:x:2:2:bin:/bin:/bin/sh
> $ more /etc/passwd | grep ^daemon
> daemon:x:1:1:daemon:/usr/sbin:/bin/sh
> The above result says userName  "bin" has userId "2", and "daemon" has userId "1".
>  
> What we can see with "getent passwd" command due to LDAP:
> $ getent passwd | grep ^bin
> bin:x:2:2:bin:/bin:/bin/sh
> bin:x:1:1:bin:/bin:/sbin/nologin
> $ getent passwd | grep ^daemon
> daemon:x:1:1:daemon:/usr/sbin:/bin/sh
> daemon:x:2:2:daemon:/sbin:/sbin/nologin
> We can see that there are multiple entries for the same userName with different userIds, and the same userId can be associated with different userNames.
> So the assumption stated in the above DEBUG_INFO message cannot be met here. The DEBUG_INFO also states that HDFS uses name as the only way to identify user/group. I'm filing this JIRA for a solution.
> Hi [~brandonli], since you implemented most of the nfs feature, would you please comment?

> Thanks.
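
The uniqueness check implied by the description above can be sketched in Java as follows. This
is only an illustration under stated assumptions: the class and method names are hypothetical
and this is not the NFS gateway's actual code. It takes passwd-style lines (the format printed by
"getent passwd"), reads the name from the first field and the id from the third field (the same
fields selected by "cut -d: -f1,3"), and reports names with several ids and ids with several names.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper: finds non-unique name/id mappings in passwd-style lines. */
final class DuplicateMappingSketch {

    /**
     * Groups the values of one colon-separated field by another field and keeps
     * only the keys that map to more than one distinct value.
     */
    static Map<String, Set<String>> conflicts(List<String> lines, int keyField, int valueField) {
        Map<String, Set<String>> grouped = new LinkedHashMap<>();
        for (String line : lines) {
            String[] fields = line.split(":", -1);
            if (fields.length <= Math.max(keyField, valueField)) {
                continue; // skip lines that do not have enough fields
            }
            grouped.computeIfAbsent(fields[keyField], k -> new LinkedHashSet<>())
                   .add(fields[valueField]);
        }
        // Drop every key whose mapping is already unique.
        grouped.values().removeIf(values -> values.size() < 2);
        return grouped;
    }

    /** Names (field 1) that appear with more than one id (field 3). */
    static Map<String, Set<String>> namesWithSeveralIds(List<String> lines) {
        return conflicts(lines, 0, 2);
    }

    /** Ids (field 3) that appear with more than one name (field 1). */
    static Map<String, Set<String>> idsWithSeveralNames(List<String> lines) {
        return conflicts(lines, 2, 0);
    }
}
```

Fed the four "getent passwd" lines shown above, namesWithSeveralIds reports both "bin" and
"daemon", and idsWithSeveralNames reports both "1" and "2". Fed only the two /etc/passwd lines,
both results are empty.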



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
