hadoop-hdfs-issues mailing list archives

From "Hari Sekhon (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-11400) Automatic HDFS Home Directory Creation
Date Fri, 10 Feb 2017 13:48:41 GMT

    [ https://issues.apache.org/jira/browse/HDFS-11400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15861287#comment-15861287 ]

Hari Sekhon commented on HDFS-11400:

[~aw] Good question. Where would such fake users come from? Given that the NN resolves users
from the OS / Kerberos, wouldn't fake users imply that the OS / Kerberos systems had already
been compromised?

A configurable user/group filter that only creates home directories automatically for
users/groups matching a whitelist regex could form a layer of protection. For example, in a
cluster integrated with Active Directory, which might have 20,000 users, you may only want 100
of those users actually using the Hadoop cluster. In practice, though, this filtering is usually
already done at the OS level via SSSD etc.
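
To illustrate the whitelist idea, here is a minimal Python sketch of the matching logic only. The
patterns, the function name should_create_home and the example user/group names are all
hypothetical; none of them are existing HDFS settings or APIs.

```python
import re

# Hypothetical allow-list patterns, chosen only for illustration.
# These are NOT real HDFS configuration keys or defaults.
ALLOWED_USER_PATTERN = re.compile(r"(hadoop_[a-z0-9]+|etl_[a-z0-9]+)")
ALLOWED_GROUP_PATTERN = re.compile(r"hadoop-users")


def should_create_home(user, groups):
    """Return True if the user, or any of the user's groups, is whitelisted."""
    if ALLOWED_USER_PATTERN.fullmatch(user):
        return True
    return any(ALLOWED_GROUP_PATTERN.fullmatch(g) for g in groups)


if __name__ == "__main__":
    print(should_create_home("etl_nightly", []))           # user matches
    print(should_create_home("alice", ["hadoop-users"]))   # group matches
    print(should_create_home("bob", ["domain-users"]))     # neither matches
```

Using fullmatch rather than search means a pattern has to cover the whole name, so a user
called "not_etl_nightly" would not slip through on a partial match.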

Another layer of protection could be a setting for either the maximum number of enumerated
users for which home directories would be automatically created, or the maximum number of home
directories already in existence. If the number of enumerated users or existing home directories
is too high, e.g. 1000, then log it and disable auto-creation until resolved, to prevent said
memory explosion. Really, the second idea, a limit on the number of existing home directories,
would be better, as the NN shouldn't be enumerating users at all but rather creating the home
directory on the fly the first time a new user is seen on the cluster and no home directory
exists for that user.
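
A rough Python sketch of that second idea follows, using an in-memory set as a stand-in for the
directories under /user. The class name, the max_home_dirs parameter and the "disable until
resolved" flag are assumptions made for illustration, not part of any existing NameNode code.

```python
import logging

log = logging.getLogger("home_dir_autocreate")


class HomeDirAutoCreator:
    """Hypothetical model of on-the-fly home directory creation with a cap."""

    def __init__(self, max_home_dirs=1000, home_root="/user"):
        self.max_home_dirs = max_home_dirs
        self.home_root = home_root
        self.existing = set()  # stand-in for directories that already exist
        self.enabled = True    # flipped off once the cap is hit

    def on_user_seen(self, user):
        """Return the user's home path, or None if it was not created."""
        path = f"{self.home_root}/{user}"
        if path in self.existing:
            return path  # nothing to do, the directory is already there
        if not self.enabled:
            return None  # auto-creation stays off until an admin resolves it
        if len(self.existing) >= self.max_home_dirs:
            log.warning(
                "%d home directories exist (limit %d); disabling auto-creation",
                len(self.existing), self.max_home_dirs,
            )
            self.enabled = False
            return None
        self.existing.add(path)
        return path


if __name__ == "__main__":
    creator = HomeDirAutoCreator(max_home_dirs=2)
    for name in ["alice", "bob", "mallory"]:
        print(name, creator.on_user_seen(name))
```

Note that no user enumeration happens here: each call handles exactly one user at the moment
that user is first seen, and only the count of existing directories is checked.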

How about these ideas?

This would stop various jobs from breaking when they try to put staging files etc. in home
directories that don't exist because they haven't yet been created manually or by a script.
(In retrospect it seems silly for admins to keep writing scripts to do this for every client
when it could be solved once and for all via NN logic.)

> Automatic HDFS Home Directory Creation
> --------------------------------------
>                 Key: HDFS-11400
>                 URL: https://issues.apache.org/jira/browse/HDFS-11400
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: hdfs, namenode
>    Affects Versions: 2.7.1
>         Environment: HDP 2.4.2
>            Reporter: Hari Sekhon
> Feature Request to add automatic home directory creation for HDFS users when they are
> first resolved by the NameNode if their home directory does not already exist, using configurable
> umask defaulting to 027.
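
For reference, a small Python sketch of what a umask of 027 would mean for a newly created home
directory, assuming the usual POSIX rule of clearing the umask bits from a base mode of 777. The
helper name mode_from_umask is made up for this example.

```python
import stat


def mode_from_umask(umask, base=0o777):
    """Clear the umask bits from the base mode (standard POSIX behaviour)."""
    return base & ~umask


if __name__ == "__main__":
    mode = mode_from_umask(0o027)
    print(oct(mode))                             # 0o750
    print(stat.filemode(stat.S_IFDIR | mode))    # drwxr-x---
```

So the owner would get full access, the group read and traverse access, and everyone else no
access at all.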

This message was sent by Atlassian JIRA

