hadoop-yarn-issues mailing list archives

From "Eric Yang (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (YARN-7430) User and Group mapping are incorrect in docker container
Date Wed, 08 Nov 2017 18:08:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-7430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16244443#comment-16244443
] 

Eric Yang edited comment on YARN-7430 at 11/8/17 6:07 PM:
----------------------------------------------------------

[~shanekumpf@gmail.com], thank you for explaining your point of view.  I understand how you
arrived at these conclusions, but some use cases cannot be satisfied by the current implementation.
 

{quote}
User "foo" in the container does not have permission to execute the launch script owned by
"skumpf" and thus the container will fail to launch with a permission denied error. We need
the -user/uid option even if privileged is requested, because without it, we have no idea
what user the container will run as.
{quote}

What is the point of using the privileged flag if the process can only run properly as "skumpf"
in a privileged container?  When a container is granted root power, the root user should be
able to do anything.  Why drop that privilege and then reacquire it later using the sticky
bit?  It is counterintuitive.

Let us review the ground rules that Docker recommends, and what we are recommending to Hadoop
users.

# The Docker security documentation clearly states that Docker must be run by trusted users only.
This means the user either has sudo privileges or is part of the docker group.
# A privileged container allows ENTRYPOINT to spawn a multi-user environment, such as systemd or
an init-like environment.
# The Hadoop YARN user can be a trusted user that spawns docker containers on behalf of the end user.
# Hadoop simulates the doAs call through container-executor, therefore the Docker security
recommendation stays intact.  If a container must run for an end user who is neither a privileged
user nor part of the docker group, then precautions must be taken to secure the point of entry
by the yarn user or container-executor.
# Docker does not know about external users and groups on LDAP, hence {{--user username}}
is essentially limited to the container's {{/etc/passwd}} and {{/etc/group}} for looking up group
membership.  Users and groups can be programmed into the docker image build, however this solution
cannot be generalized for LDAP users in the Hadoop ecosystem.  We don't want to end up rebuilding
images each time a new LDAP user is added.
# Docker added {{--user uid:gid}} and {{--group-add}} to assign user credentials and group
membership for dynamic users without depending on {{/etc/passwd}} and {{/etc/group}} for lookup.
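
To illustrate the last two points, here is a minimal sketch of how numeric ids could be turned into {{docker run}} flags.  The function name and the example ids are hypothetical and are not taken from the YARN-7430 patch; only the {{--user}} and {{--group-add}} flags come from the discussion above.

```python
from typing import Iterable, List


def docker_identity_args(uid: int, gid: int, extra_gids: Iterable[int] = ()) -> List[str]:
    """Build docker-run flags that pin the container process to numeric ids.

    Hypothetical helper: numeric ids need no matching entry in the image's
    /etc/passwd or /etc/group, which is the point made for LDAP users.
    """
    if uid < 0 or gid < 0:
        raise ValueError("uid and gid must be non-negative")

    args = ["--user", f"{uid}:{gid}"]

    # The primary gid is already set by --user, so skip it and any repeats.
    seen = {gid}
    for extra in extra_gids:
        if extra < 0:
            raise ValueError("supplementary gids must be non-negative")
        if extra not in seen:
            seen.add(extra)
            args += ["--group-add", str(extra)]
    return args


if __name__ == "__main__":
    # Example ids are made up for illustration.
    print(" ".join(docker_identity_args(1001, 1001, [1001, 2000, 3000])))
```

On the host side, the ids could be resolved with Python's {{pwd.getpwnam}} and {{os.getgrouplist}}, which go through the host's name service configuration and so can see LDAP users when the host is set up for that.  This is an assumption about the deployment, not something the patch is known to do.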

In order to resolve the conflicting user management between Docker and Hadoop, we must streamline
the implementation so that it can support multi-user docker containers (privileged
containers) as well as single LDAP user containers (non-privileged containers).  A privileged
container can only be spawned by a trusted user for a trusted user.  Hence, the privileged
container image can contain multiple users that are already pre-approved by the system
administrator.  A privileged container can acquire additional resources using mount points, and
a consistent file system ACL inside and outside of the container governs the overall security.

There should never be a case where we allow a localized resource for {{skumpf}} to work as the
{{foo}} user without a properly secured file system ACL.  At the very least, we don't want to
make this case work, so that file system ACL rules are not broken.  {{skumpf}} must do more work
to secure the localized resource with proper permissions, if he has the power to do so.
Ultimately, file system permission is the last line of security defense that we have for storing
files in HDFS via an NFS mount point.
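
The permission-denied case from the quote can be reproduced with a small model of the classic owner/group/other check.  This is only a sketch: it ignores extended ACLs and capabilities, and the uids, gids, and function name are invented for illustration.

```python
import stat
from typing import Iterable


def may_execute(file_uid: int, file_gid: int, mode: int,
                uid: int, gids: Iterable[int]) -> bool:
    """Model the basic permission check for executing a regular file.

    Simplified sketch: no extended ACLs, no capabilities, no mount options.
    """
    any_exec = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH
    if uid == 0:
        # root may execute a file as long as at least one execute bit is set.
        return bool(mode & any_exec)
    if uid == file_uid:
        # The owner class is checked first and is decisive for the owner.
        return bool(mode & stat.S_IXUSR)
    if file_gid in set(gids):
        return bool(mode & stat.S_IXGRP)
    return bool(mode & stat.S_IXOTH)


if __name__ == "__main__":
    SKUMPF_UID, SKUMPF_GID = 1000, 1000   # hypothetical ids
    FOO_UID, FOO_GID = 2000, 2000         # hypothetical ids
    launch_script_mode = 0o700            # owner-only launch script

    print("foo:", may_execute(SKUMPF_UID, SKUMPF_GID, launch_script_mode,
                              FOO_UID, [FOO_GID]))
    print("root:", may_execute(SKUMPF_UID, SKUMPF_GID, launch_script_mode,
                               0, [0]))
```

With an owner-only script, the model denies {{foo}} and allows root, which matches both the failure described in the quote and the argument that a privileged container running as root does not need to switch users to start the launch script.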

From this point of view, does it make more sense to run {{--privileged}} without {{--user
username}}?
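
One possible reading of this proposal, written as a sketch: privileged containers are restricted to trusted submitters and get no {{--user}} flag, while non-privileged containers are pinned to a numeric uid:gid.  The function, its parameters, and the trusted-user set are hypothetical and do not describe the actual YARN implementation.

```python
from typing import AbstractSet, Iterable, List


def container_launch_flags(submitter: str, privileged: bool,
                           uid: int, gid: int,
                           extra_gids: Iterable[int] = (),
                           trusted_users: AbstractSet[str] = frozenset()) -> List[str]:
    """Pick docker-run identity flags under the policy discussed above.

    Hypothetical policy sketch, not code from the YARN-7430 patch.
    """
    if privileged:
        # Only pre-approved users may start a multi-user (privileged) container.
        if submitter not in trusted_users:
            raise PermissionError(f"{submitter} may not launch privileged containers")
        # No --user flag: the image's ENTRYPOINT manages its own users.
        return ["--privileged"]

    # Single-user container: pin identity with numeric ids so no entry in the
    # image's /etc/passwd or /etc/group is required.
    flags = ["--user", f"{uid}:{gid}"]
    for extra in extra_gids:
        if extra != gid:
            flags += ["--group-add", str(extra)]
    return flags


if __name__ == "__main__":
    trusted = frozenset({"skumpf"})  # hypothetical administrator-approved list
    print(container_launch_flags("skumpf", True, 1000, 1000, trusted_users=trusted))
    print(container_launch_flags("foo", False, 2000, 2000, [3000]))
```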






> User and Group mapping are incorrect in docker container
> --------------------------------------------------------
>
>                 Key: YARN-7430
>                 URL: https://issues.apache.org/jira/browse/YARN-7430
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: security, yarn
>    Affects Versions: 2.9.0, 3.0.0
>            Reporter: Eric Yang
>            Assignee: Eric Yang
>            Priority: Blocker
>         Attachments: YARN-7430.001.patch
>
>
> In YARN-4266, the recommendation was to use -u [uid]:[gid] numeric values to enforce the user
> and group for the running user.  In YARN-6623, this translated to --user=test --group-add=group1.
> The code no longer enforces the group correctly for the launched process.
> In addition, the implementation in YARN-6623 requires the user and group information to exist
> in the container to translate username and group to uid/gid.  For users on LDAP, there is no
> good way to populate the container with user and group information.



