hadoop-mapreduce-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-6550) archive-logs tool changes log ownership to the Yarn user when using DefaultContainerExecutor
Date Fri, 20 Nov 2015 15:28:11 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-6550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15018151#comment-15018151 ]

Jason Lowe commented on MAPREDUCE-6550:
---------------------------------------

Thanks for updating the patch!  Looks good except for one minor thing: usually permission
updates after the fact are a sign of potential security problems unless the permissions are
being widened instead of restricted.  Something could come along and get access to the file
after it's written but before the permissions are fixed.  Would it make more sense to update
the filesystem umask before running the archive so the files are automatically created with
the proper permissions?
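
To illustrate the difference, here is a minimal sketch on a local POSIX filesystem, not the patch itself and not HDFS code. It sets the process umask before the file is created, so the file never exists with wider permissions than intended. The helper name write_restricted and the umask value 027 are assumptions chosen for illustration (027 yields the rw-r----- mode seen on the original aggregated logs). In Hadoop the analogous knob is the fs.permissions.umask-mode configuration property.

```python
import os
import stat
import tempfile


def write_restricted(path, data, umask=0o027):
    """Create `path` with the umask already in effect (hypothetical helper).

    The permissions are restricted at creation time, so there is no window
    in which another user could open the file before a later chmod.
    """
    old = os.umask(umask)  # process-wide setting; restored in `finally`
    try:
        # The requested mode 0o666 is filtered by the umask at creation.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o666)
        with os.fdopen(fd, "wb") as f:
            f.write(data)
    finally:
        os.umask(old)
    return stat.S_IMODE(os.stat(path).st_mode)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        mode = write_restricted(os.path.join(d, "part-0"), b"log data")
        print(oct(mode))
```

The alternative of writing the file first and calling chmod afterwards leaves a short interval in which the file carries the default (wider) mode, which is the race described above.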


> archive-logs tool changes log ownership to the Yarn user when using DefaultContainerExecutor
> --------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6550
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6550
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 2.8.0
>            Reporter: Robert Kanter
>            Assignee: Robert Kanter
>         Attachments: MAPREDUCE-6550.001.patch, MAPREDUCE-6550.002.patch
>
>
> The archive-logs tool added in MAPREDUCE-6415 leverages the Distributed Shell app.  When
> using the DefaultContainerExecutor, this means that the job will actually run as the Yarn
> user, so the resulting har files are owned by the Yarn user instead of the original owner.
> The permissions are also now world-readable.
> In the below example, the archived logs are owned by 'yarn' instead of 'paul' and are
> now world-readable:
> {noformat}
> [root@gs28-centos66-5 ~]# sudo -u hdfs hdfs dfs -ls -R /tmp/logs
> ...
> drwxrwx---   - paul  hadoop          0 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0005
> drwxr-xr-x   - yarn  hadoop          0 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0005/application_1443805425363_0005.har
> -rw-r--r--   3 yarn  hadoop          0 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0005/application_1443805425363_0005.har/_SUCCESS
> -rw-r--r--   3 yarn  hadoop       1256 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0005/application_1443805425363_0005.har/_index
> -rw-r--r--   3 yarn  hadoop         24 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0005/application_1443805425363_0005.har/_masterindex
> -rw-r--r--   3 yarn  hadoop    8451177 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0005/application_1443805425363_0005.har/part-0
> drwxrwx---   - paul  hadoop          0 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0006
> -rw-r-----   3 paul  hadoop       1155 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0006/gs-centos66-2.vpc.cloudera.com_8041
> -rw-r-----   3 paul  hadoop       4880 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0006/gs28-centos66-3.vpc.cloudera.com_8041
> ...
> {noformat}
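
As a rough aid for reading the listing above, the following sketch converts an ls-style permission string into mode bits and checks the "other read" bit. The function names are hypothetical, and the parser deliberately ignores special bits (setuid, setgid, sticky), which do not appear in the listing.

```python
import stat


def mode_from_ls(perm):
    """Convert an ls-style string such as '-rw-r--r--' to permission bits.

    Only the nine rwx characters are interpreted; special bits shown as
    's' or 't' are not handled in this sketch.
    """
    mode = 0
    for i, ch in enumerate(perm[1:10]):
        if ch != "-":
            mode |= 1 << (8 - i)
    return mode


def world_readable(perm):
    """True if the 'other' class has read permission."""
    return bool(mode_from_ls(perm) & stat.S_IROTH)


if __name__ == "__main__":
    # Modes taken from the listing: archived part file vs. original log file.
    for perm in ("-rw-r--r--", "-rw-r-----"):
        print(perm, oct(mode_from_ls(perm)), world_readable(perm))
```

Applied to the listing, the files inside the .har directory (rw-r--r--) are world-readable, while the original aggregated logs (rw-r-----) are not.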



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
