hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Yongjun Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7146) NFS ID/Group lookup requires SSSD enumeration on the server
Date Mon, 10 Nov 2014 17:53:35 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14205080#comment-14205080

Yongjun Zhang commented on HDFS-7146:

Hi Guys,

Sorry to get back late. I just uploaded a patch on top of the HADOOP-11195. I'd appreciate
it that you could help reviewing it when you have time.

A recap of the solution:

# At initialization, the maps are empty
# Both users/groups/ids are added to the map on demand (e.g. when requested),
# When groupId is requested for a given groupName, and if the groupName is numerical, the
full group map is loaded (this is lazy full list load I referred to earlier)
# Periodically update the cached maps for both user and group. What I do here is actually
to clear (reinitialize the maps). I imaged that some users and groups might be removed (for
example, a user changed job, so their entries need to be removed). 
# Steps 2 and 3 will be repeated. 

BTW, because now we changed to incrementally updating the map, there tends to be a lot of
messages like
LOG.info("Updated " + mapName + " map size: " + map.size());
I took the liberty to change it to a debug message in the patch.


> NFS ID/Group lookup requires SSSD enumeration on the server
> -----------------------------------------------------------
>                 Key: HDFS-7146
>                 URL: https://issues.apache.org/jira/browse/HDFS-7146
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: nfs
>    Affects Versions: 2.6.0
>            Reporter: Yongjun Zhang
>            Assignee: Yongjun Zhang
>         Attachments: HDFS-7146.001.patch, HDFS-7146.002.allIncremental.patch, HDFS-7146.003.patch,
> The current implementation of the NFS UID and GID lookup works by running 'getent passwd'
with an assumption that it will return the entire list of users available on the OS, local
and remote (AD/etc.).
> This behaviour of the command is advised to be and is prevented by administrators in
most secure setups to avoid excessive load to the ADs involved, as the # of users to be listed
may be too large, and the repeated requests of ALL users not present in the cache would be
too much for the AD infrastructure to bear.
> The NFS server should likely do lookups based on a specific UID request, via 'getent
passwd <UID>', if the UID does not match a cached value. This reduces load on the LDAP
backed infrastructure.
> Thanks [~qwertymaniac] for reporting the issue.

This message was sent by Atlassian JIRA

View raw message