hadoop-mapreduce-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: Why is my output directory owned by yarn?
Date Sat, 02 Nov 2013 00:22:02 GMT
The DefaultContainerExecutor isn't the one that can do setuid. The
LinuxContainerExecutor can do that.
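
Switching the executor in yarn-site.xml is the usual first step. A
minimal sketch (the class name is from the stock Hadoop 2.x docs; the
group must match the one in your container-executor.cfg):

<property>
  <name>yarn.nodemanager.container-executor.class</name>
  <value>org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor</value>
</property>
<property>
  <name>yarn.nodemanager.linux-container-executor.group</name>
  <value>hadoop</value>
</property>

Note that the setuid container-executor binary shipped with the
NodeManager must be owned by root, with that group, for this to work.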

On Fri, Nov 1, 2013 at 8:00 PM, Bill Sparks <jsparks@cray.com> wrote:
> Well, I thought I'd set all this up correctly - the NodeManager nodes
> can switch to my user id, so general user authentication is working.
> But the output is still written as yarn. I guess my question is how to
> enable secure mode - I thought that was the default mode.
>
> When the container scripts are written, they contain the correct user
> name (included below).
>
> cat /tmp/hadoop-yarn/cache/yarn/nm-local-dir/usercache/jdoe/appcache/application_1383247324024_0005/container_1383247324024_0005_01_000001/launch_container.sh
> #!/bin/bash
>
> export YARN_LOCAL_DIRS="/tmp/hadoop-yarn/cache/yarn/nm-local-dir/usercache/jdoe/appcache/application_1383247324024_0005"
> export NM_HTTP_PORT="8042"
> export HADOOP_COMMON_HOME="/usr/lib/hadoop"
> export JAVA_HOME="/opt/java/jdk1.6.0_20"
> export HADOOP_YARN_HOME="/usr/lib/hadoop-yarn"
> export NM_HOST="nid00031"
> export CLASSPATH="$PWD:$HADOOP_CONF_DIR:$HADOOP_COMMON_HOME/*:$HADOOP_COMMON_HOME/lib/*:$HADOOP_HDFS_HOME/*:$HADOOP_HDFS_HOME/lib/*:$HADOOP_MAPRED_HOME/*:$HADOOP_MAPRED_HOME/lib/*:$HADOOP_YARN_HOME/*:$HADOOP_YARN_HOME/lib/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*:job.jar/job.jar:job.jar/classes/:job.jar/lib/*:$PWD/*"
> export HADOOP_TOKEN_FILE_LOCATION="/tmp/hadoop-yarn/cache/yarn/nm-local-dir/usercache/jdoe/appcache/application_1383247324024_0005/container_1383247324024_0005_01_000001/container_tokens"
> export APPLICATION_WEB_PROXY_BASE="/proxy/application_1383247324024_0005"
> export JVM_PID="$$"
> export USER="jdoe"
> export HADOOP_HDFS_HOME="/usr/lib/hadoop-hdfs"
> export PWD="/tmp/hadoop-yarn/cache/yarn/nm-local-dir/usercache/jdoe/appcache/application_1383247324024_0005/container_1383247324024_0005_01_000001"
> export NM_PORT="36276"
> export HOME="/home/"
> export LOGNAME="jdoe"
> export APP_SUBMIT_TIME_ENV="1383312862021"
> export HADOOP_CONF_DIR="/etc/hadoop/conf"
> export MALLOC_ARENA_MAX="4"
> export AM_CONTAINER_ID="container_1383247324024_0005_01_000001"
> ln -sf "/tmp/hadoop-yarn/cache/yarn/nm-local-dir/usercache/jdoe/appcache/application_1383247324024_0005/filecache/-300930022458385182/job.jar" "job.jar"
> mkdir -p jobSubmitDir
> ln -sf "/tmp/hadoop-yarn/cache/yarn/nm-local-dir/usercache/jdoe/appcache/application_1383247324024_0005/filecache/-4297161085730400838/job.splitmetainfo" "jobSubmitDir/job.splitmetainfo"
> mkdir -p jobSubmitDir
> ln -sf "/tmp/hadoop-yarn/cache/yarn/nm-local-dir/usercache/jdoe/appcache/application_1383247324024_0005/filecache/-3754219748389402012/job.split" "jobSubmitDir/job.split"
> ln -sf "/tmp/hadoop-yarn/cache/yarn/nm-local-dir/usercache/jdoe/appcache/application_1383247324024_0005/filecache/233482461420248540/job.xml" "job.xml"
> mkdir -p jobSubmitDir
> ln -sf "/tmp/hadoop-yarn/cache/yarn/nm-local-dir/usercache/jdoe/appcache/application_1383247324024_0005/filecache/-8903348211231085224/appTokens" "jobSubmitDir/appTokens"
> exec /bin/bash -c "$JAVA_HOME/bin/java -Dlog4j.configuration=container-log4j.properties -Dyarn.app.mapreduce.container.log.dir=/tmp/hadoop-yarn/containers/application_1383247324024_0005/container_1383247324024_0005_01_000001 -Dyarn.app.mapreduce.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA  -Xmx1024m org.apache.hadoop.mapreduce.v2.app.MRAppMaster 1>/tmp/hadoop-yarn/containers/application_1383247324024_0005/container_1383247324024_0005_01_000001/stdout 2>/tmp/hadoop-yarn/containers/application_1383247324024_0005/container_1383247324024_0005_01_000001/stderr  "
>
> # cat /tmp/hadoop-yarn/cache/yarn/nm-local-dir/usercache/jdoe/appcache/application_1383247324024_0005/container_1383247324024_0005_01_000001/default_container_executor.sh
> #!/bin/bash
>
> echo $$ > /tmp/hadoop-yarn/cache/yarn/nm-local-dir/nmPrivate/container_1383247324024_0005_01_000001.pid.tmp
> /bin/mv -f /tmp/hadoop-yarn/cache/yarn/nm-local-dir/nmPrivate/container_1383247324024_0005_01_000001.pid.tmp /tmp/hadoop-yarn/cache/yarn/nm-local-dir/nmPrivate/container_1383247324024_0005_01_000001.pid
> exec setsid /bin/bash "/tmp/hadoop-yarn/cache/yarn/nm-local-dir/usercache/jdoe/appcache/application_1383247324024_0005/container_1383247324024_0005_01_000001/launch_container.sh"
>
>
>
> yarn-site.xml
> ...
> <property>
>   <name>yarn.nodemanager.container-executor.class</name>
>   <value>org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor</value>
> </property>
> <property>
>   <name>yarn.nodemanager.linux-container-executor.group</name>
>   <value>hadoop</value>
> </property>
>
> hdfs-site.xml
> ...
> <property>
>   <name>dfs.permissions</name>
>   <value>true</value>
> </property>
>
>
> --
> Jonathan (Bill) Sparks
> Software Architecture
> Cray Inc.
>
>
>
>
>
> On 10/31/13 6:12 AM, "Harsh J" <harsh@cloudera.com> wrote:
>
>>In insecure mode the containers run as the daemon's owner, i.e.
>>"yarn". Since the LocalFileSystem implementation has no way to
>>impersonate other users (we don't run as root/etc.), it can create
>>files only as the "yarn" user. On HDFS, we can send the right username
>>in as a form of authentication, and it's reflected on the created
>>files.
>>
>>If you enable the LinuxContainerExecutor (or generally enable
>>security) then the containers run after being setuid'd to the
>>submitting user, and your files would appear with the right owner.
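>>
>>For reference, the LinuxContainerExecutor relies on the setuid
>>container-executor binary, which reads container-executor.cfg. A
>>minimal sketch of that file (illustrative values; check the docs for
>>your exact version):
>>
>>yarn.nodemanager.linux-container-executor.group=hadoop
>>banned.users=hdfs,yarn,mapred,bin
>>min.user.id=500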
>>
>>On Wed, Oct 30, 2013 at 1:49 AM, Bill Sparks <jsparks@cray.com> wrote:
>>>
>>> I have a strange use case and I'm looking for some debugging help.
>>>
>>>
>>> Use Case:
>>>
>>> If I run the Hadoop MapReduce example wordcount program and write the
>>> output to HDFS, the output directory has the correct ownership.
>>>
>>> E.g.
>>>
>>> hadoop jar
>>> /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples-2.0.5-alpha.jar
>>> wordcount /user/jdoe/simple/HF.txt /users/jdoe/simple/outtest1
>>>
>>> hdfs dfs -ls simple
>>> Found 3 items
>>> drwxr-xr-x - jdoe supergroup 0 2013-10-25 21:26 simple/HF.out
>>> -rw-r--r-- 1 jdoe supergroup 610157 2013-10-25 21:21 simple/HF.txt
>>> drwxr-xr-x - jdoe supergroup 0 2013-10-29 14:50 simple/outtest1
>>>
>>> Whereas if I write to a global filesystem, my output directory is
>>> owned by yarn.
>>>
>>>
>>> E.g.
>>>
>>> hadoop jar
>>> /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples-2.0.5-alpha.jar
>>> wordcount /user/jdoe/simple/HF.txt file:///scratch/jdoe/outtest1
>>> ls -l /scratch/jdoe
>>> total 8
>>> drwxr-xr-x 2 root root 4096 Oct 28 23:26 logs
>>> drwxr-xr-x 2 yarn yarn 4096 Oct 28 23:23 outtest1
>>>
>>>
>>>
>>> I've looked at the container log files and saw no errors. The only
>>> thing I can think of is that the user authentication mode is
>>> "files:ldap" and the nodemanager nodes do not have access to the
>>> corporate LDAP server, so it's working off the local /etc/shadow,
>>> which does not have my credentials - so it might just default to
>>> "yarn".
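>>>
>>> A quick way to test that theory on a NodeManager node is to ask NSS
>>> directly whether it can resolve my user (sketch; jdoe is my login):
>>>
>>> getent passwd jdoe   # resolves via nsswitch.conf (files, then ldap)
>>> id jdoe              # fails if the node cannot map the submitting user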
>>>
>>> I did find the following warning:
>>>
>>> 2013-10-29 14:58:52,184 INFO
>>> org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=jdoe
>>> OPERATION=Container Finished - Succeeded TARGET=ContainerImpl
>>> RESULT=SUCCESS APPID=application_1383020136544_0005
>>> CONTAINERID=container_1383020136544_0005_01_000001
>>> ...
>>> 2013-10-29 14:58:53,062 WARN
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl:
>>> Trying to stop unknown container container_1383020136544_0005_01_000001
>>> 2013-10-29 14:58:53,062 WARN
>>> org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger:
>>> USER=UnknownUser IP=10.128.0.17 OPERATION=Stop Container Request
>>> TARGET=ContainerManagerImpl RESULT=FAILURE DESCRIPTION=Trying to stop
>>> unknown container! APPID=application_1383020136544_0005
>>> CONTAINERID=container_1383020136544_0005_01_000001
>>>
>>>
>>>
>>>
>>> Thanks,
>>>    John
>>>
>>
>>
>>
>>--
>>Harsh J
>



-- 
Harsh J
