hadoop-hdfs-issues mailing list archives

From "Eric Yang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDDS-1609) Remove hard coded uid from Ozone docker image
Date Tue, 18 Jun 2019 21:22:00 GMT

    [ https://issues.apache.org/jira/browse/HDDS-1609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16867053#comment-16867053
] 

Eric Yang commented on HDDS-1609:
---------------------------------

[~elek] Tested on a fresh test node, and I am able to get the smoke test to pass.  However,
the privilege escalation problem still occurs on the new node.

{code}
[tester@eyangkube result]$ id tester
uid=2002(tester) gid=2002(tester) groups=2002(tester),993(docker)
[tester@eyangkube result]$ id centos
uid=1000(centos) gid=1000(centos) groups=4(adm),10(wheel),190(systemd-journal),1000(centos)
[tester@eyangkube result]$ ls -la
total 660
drwxrwxrwx 2 tester tester    146 Jun 18 11:44 .
drwxrwxr-x 3 tester tester     95 Jun 18 11:43 ..
-rw-rw-r-- 1 tester tester 152818 Jun 18 11:44 docker-ozone-basic-scm.log
-rw-r--r-- 1 centos users  234844 Jun 18 11:44 log.html
-rw-r--r-- 1 centos users  230000 Jun 18 11:44 report.html
-rw-r--r-- 1 centos users   25288 Jun 18 11:44 robot-ozone-auditparser-om.xml
-rw-r--r-- 1 centos users   16733 Jun 18 11:44 robot-ozone-basic-scm.xml
{code}

The building user is tester, but all of the test results are owned by the centos user.  This
rules out a CentOS-specific problem.  I disabled "stop_docker_env" in test.sh and logged
into the container to inspect ozone-site.xml.  The file is empty, yet the tests pass.  I am
not sure why the tests pass, because the Ozone cluster is clearly defective in the arranged
setup.
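The empty-config observation above can be made explicit with a small check.  This is only a sketch; the helper name and the config path are illustrative assumptions based on the /opt/hadoop mount, not taken from test.sh:

```shell
# Hypothetical helper: report whether a generated config file has content.
check_conf() {
  if [ -s "$1" ]; then
    echo "ok: $1"
  else
    echo "empty or missing: $1"
  fi
}

# Assumed in-container location of the generated config.
check_conf /opt/hadoop/etc/hadoop/ozone-site.xml
```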

Please explain how ozone-site.xml is supposed to get rewritten when ozone-0.5.0-SNAPSHOT is
mounted into the hadoop-runner image at /opt/hadoop.  The files in the image are owned by the host user:

{code}
bash-4.2$ ls -la
total 96
drwxr-xr-x. 3 501 1001  4096 Jun 14 20:58 .
drwxrwxr-x. 3 501 1001    20 Jun 14 20:58 ..
-rw-rw-r--. 1 501 1001   774 Jun 14 20:58 core-site.xml
-rw-rw-r--. 1 501 1001  3571 Jun 14 20:58 dn-audit-log4j2.properties
-rw-rw-r--. 1 501 1001  3999 Jun 14 20:58 hadoop-env.cmd
-rw-rw-r--. 1 501 1001 16873 Jun 14 20:58 hadoop-env.sh
-rw-rw-r--. 1 501 1001  3321 Jun 14 20:58 hadoop-metrics2.properties
-rw-rw-r--. 1 501 1001 11392 Jun 14 20:58 hadoop-policy.xml
-rw-rw-r--. 1 501 1001  3414 Jun 14 20:58 hadoop-user-functions.sh.example
-rw-rw-r--. 1 501 1001  5701 Jun 14 20:58 log4j.properties
-rw-rw-r--. 1 501 1001  3176 Jun 14 20:58 network-topology-default.xml
-rw-rw-r--. 1 501 1001  3380 Jun 14 20:58 network-topology-nodegroup.xml
-rw-rw-r--. 1 501 1001  3571 Jun 14 20:58 om-audit-log4j2.properties
-rw-rw-r--. 1 501 1001   978 Jun 14 20:58 ozone-site.xml
-rw-rw-r--. 1 501 1001  3574 Jun 14 20:58 scm-audit-log4j2.properties
drwxrwxr-x. 2 501 1001    24 Jun 14 20:58 shellprofile.d
-rw-rw-r--. 1 501 1001  2316 Jun 14 20:58 ssl-client.xml.example
-rw-rw-r--. 1 501 1001  2697 Jun 14 20:58 ssl-server.xml.example
-rw-rw-r--. 1 501 1001    10 Jun 14 20:58 workers
bash-4.2$ id hadoop
uid=1000(hadoop) gid=100(users) groups=100(users)
{code}

envtoconf.py fails because the hadoop user's uid/gid do not match those of the host-level
user.  I think the tests failing on my main development machine is an accurate reflection of
the current state of the developer container image.  The host filesystem and the docker
environment must have a consistent view of file system attributes for the smoke tests to
work, and this is lacking in the current implementation.
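For illustration, the mismatch that breaks envtoconf.py reduces to a uid comparison.  The helper below is a sketch; 2002 and 1000 are the tester and hadoop uids from the listings above:

```shell
# Sketch: compare the host user's uid with the uid baked into the image.
# hadoop-runner hard-codes the hadoop user as uid 1000.
uid_view_consistent() {
  # $1 = host uid, $2 = in-image uid
  if [ "$1" -eq "$2" ]; then
    echo "consistent"
  else
    echo "mismatch: host uid $1 vs image uid $2"
  fi
}

uid_view_consistent 2002 1000   # tester (2002) vs hadoop (1000)
```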

The smoke tests are passing on the test node for the wrong reason: they appear to work
against the local file system only because the configs are empty.
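One way to reconcile the host and container views, along the lines of Solution 1 in the issue description below, is to run the container as the host uid/gid and hand it a minimal passwd entry so the user keeps a name.  This is only a sketch; the file path, field values, and docker invocation are illustrative:

```shell
# Build a one-line /etc/passwd-style entry for the host user; fields
# other than name/uid/gid are placeholders.
make_passwd_entry() {
  printf '%s:x:%s:%s::/opt/hadoop:/bin/bash\n' "$1" "$2" "$3"
}

make_passwd_entry tester 2002 2002 > /tmp/passwd.tester
cat /tmp/passwd.tester
# Then (illustrative):
#   docker run -u 2002:2002 -v /tmp/passwd.tester:/etc/passwd:ro ...
```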



> Remove hard coded uid from Ozone docker image
> ---------------------------------------------
>
>                 Key: HDDS-1609
>                 URL: https://issues.apache.org/jira/browse/HDDS-1609
>             Project: Hadoop Distributed Data Store
>          Issue Type: Sub-task
>            Reporter: Eric Yang
>            Priority: Major
>             Fix For: 0.5.0
>
>         Attachments: linux.txt, log.html, osx.txt, report.html
>
>
> The hadoop-runner image is hard coded to [USER hadoop|https://github.com/apache/hadoop/blob/docker-hadoop-runner-jdk11/Dockerfile#L45],
and the hadoop user is hard coded to uid 1000.  This arrangement complicates development
environments where the host user has a uid other than 1000: data written to external bind
mount locations is owned by uid 1000, which can prevent the development environment from
cleaning up test data.
> The Docker documentation states that "The best way to prevent privilege-escalation attacks
from within a container is to configure your container’s applications to run as unprivileged
users."  From the Ozone architecture point of view, there is no reason for the Ozone daemons
to require a privileged user or a hard-coded user.
> h3. Solution 1
> It would be best to support running the docker container as the host user to reduce friction.
  The user should be able to run:
> {code}
> docker run -u $(id -u):$(id -g) ...
> {code}
> or in docker-compose file:
> {code}
> user: "${UID}:${GID}"
> {code}
> By doing this, the user will be nameless inside the docker container.  Some commands may
warn that the user does not have a name.  This can be resolved by mounting /etc/passwd, or a
file that looks like /etc/passwd, that contains the host user's entry.
> h3. Solution 2
> Move the hard-coded user to a uid in the range below 200.  The default Linux profile
reserves service users with uid < 200 and gives them a umask that keeps data private to the
service user, or group writable if the service shares a group with other service users.
Register the service user with the Linux vendors to ensure there is a reserved uid for the
Hadoop user, or pick one that works for Hadoop.  This is a longer route to pursue, and may
not be fruitful.
> h3. Solution 3
> Default the docker image to having the sssd client installed.  This allows the docker image
to see host-level user names by binding the sssd socket.  The instructions for doing this are
in the [Hadoop documentation|https://hadoop.apache.org/docs/r3.1.2/hadoop-yarn/hadoop-yarn-site/DockerContainers.html#User_Management_in_Docker_Container].
> The prerequisite for this approach is that the host-level system has sssd installed, which
is almost always the case on production systems.
> Combining solutions 1 and 3 may be the proper approach.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

