hadoop-mapreduce-user mailing list archives

From Bill Sparks <jspa...@cray.com>
Subject Re: Why is my output directory owned by yarn?
Date Fri, 01 Nov 2013 14:30:59 GMT
Well, I thought I had set all this up correctly, and on the NodeManager
nodes I can change to my user id, so general user authentication is working.
But the output is still written as yarn. I guess my question is how to
enable secure mode - I thought that was the default mode.

When the container launch scripts are written, they contain the correct
user name (scripts included below).

cat /tmp/hadoop-yarn/cache/yarn/nm-local-dir/usercache/jdoe/appcache/application_1383247324024_0005/container_1383247324024_0005_01_000001/launch_container.sh
#!/bin/bash

export YARN_LOCAL_DIRS="/tmp/hadoop-yarn/cache/yarn/nm-local-dir/usercache/jdoe/appcache/application_1383247324024_0005"
export NM_HTTP_PORT="8042"
export HADOOP_COMMON_HOME="/usr/lib/hadoop"
export JAVA_HOME="/opt/java/jdk1.6.0_20"
export HADOOP_YARN_HOME="/usr/lib/hadoop-yarn"
export NM_HOST="nid00031"
export CLASSPATH="$PWD:$HADOOP_CONF_DIR:$HADOOP_COMMON_HOME/*:$HADOOP_COMMON_HOME/lib/*:$HADOOP_HDFS_HOME/*:$HADOOP_HDFS_HOME/lib/*:$HADOOP_MAPRED_HOME/*:$HADOOP_MAPRED_HOME/lib/*:$HADOOP_YARN_HOME/*:$HADOOP_YARN_HOME/lib/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*:job.jar/job.jar:job.jar/classes/:job.jar/lib/*:$PWD/*"
export HADOOP_TOKEN_FILE_LOCATION="/tmp/hadoop-yarn/cache/yarn/nm-local-dir/usercache/jdoe/appcache/application_1383247324024_0005/container_1383247324024_0005_01_000001/container_tokens"
export APPLICATION_WEB_PROXY_BASE="/proxy/application_1383247324024_0005"
export JVM_PID="$$"
export USER="jdoe"
export HADOOP_HDFS_HOME="/usr/lib/hadoop-hdfs"
export PWD="/tmp/hadoop-yarn/cache/yarn/nm-local-dir/usercache/jdoe/appcache/application_1383247324024_0005/container_1383247324024_0005_01_000001"
export NM_PORT="36276"
export HOME="/home/"
export LOGNAME="jdoe"
export APP_SUBMIT_TIME_ENV="1383312862021"
export HADOOP_CONF_DIR="/etc/hadoop/conf"
export MALLOC_ARENA_MAX="4"
export AM_CONTAINER_ID="container_1383247324024_0005_01_000001"
ln -sf "/tmp/hadoop-yarn/cache/yarn/nm-local-dir/usercache/jdoe/appcache/application_1383247324024_0005/filecache/-300930022458385182/job.jar" "job.jar"
mkdir -p jobSubmitDir
ln -sf "/tmp/hadoop-yarn/cache/yarn/nm-local-dir/usercache/jdoe/appcache/application_1383247324024_0005/filecache/-4297161085730400838/job.splitmetainfo" "jobSubmitDir/job.splitmetainfo"
mkdir -p jobSubmitDir
ln -sf "/tmp/hadoop-yarn/cache/yarn/nm-local-dir/usercache/jdoe/appcache/application_1383247324024_0005/filecache/-3754219748389402012/job.split" "jobSubmitDir/job.split"
ln -sf "/tmp/hadoop-yarn/cache/yarn/nm-local-dir/usercache/jdoe/appcache/application_1383247324024_0005/filecache/233482461420248540/job.xml" "job.xml"
mkdir -p jobSubmitDir
ln -sf "/tmp/hadoop-yarn/cache/yarn/nm-local-dir/usercache/jdoe/appcache/application_1383247324024_0005/filecache/-8903348211231085224/appTokens" "jobSubmitDir/appTokens"
exec /bin/bash -c "$JAVA_HOME/bin/java -Dlog4j.configuration=container-log4j.properties -Dyarn.app.mapreduce.container.log.dir=/tmp/hadoop-yarn/containers/application_1383247324024_0005/container_1383247324024_0005_01_000001 -Dyarn.app.mapreduce.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA  -Xmx1024m org.apache.hadoop.mapreduce.v2.app.MRAppMaster 1>/tmp/hadoop-yarn/containers/application_1383247324024_0005/container_1383247324024_0005_01_000001/stdout 2>/tmp/hadoop-yarn/containers/application_1383247324024_0005/container_1383247324024_0005_01_000001/stderr  "

# cat /tmp/hadoop-yarn/cache/yarn/nm-local-dir/usercache/jdoe/appcache/application_1383247324024_0005/container_1383247324024_0005_01_000001/default_container_executor.sh
#!/bin/bash

echo $$ > /tmp/hadoop-yarn/cache/yarn/nm-local-dir/nmPrivate/container_1383247324024_0005_01_000001.pid.tmp
/bin/mv -f /tmp/hadoop-yarn/cache/yarn/nm-local-dir/nmPrivate/container_1383247324024_0005_01_000001.pid.tmp /tmp/hadoop-yarn/cache/yarn/nm-local-dir/nmPrivate/container_1383247324024_0005_01_000001.pid
exec setsid /bin/bash "/tmp/hadoop-yarn/cache/yarn/nm-local-dir/usercache/jdoe/appcache/application_1383247324024_0005/container_1383247324024_0005_01_000001/launch_container.sh"



yarn-site.xml
...
  <property>
   <name>yarn.nodemanager.container-executor.class</name>
   <value>org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor</value>
  </property>
  <property>
   <name>yarn.nodemanager.linux-container-executor.group</name>
   <value>hadoop</value>
  </property>

hdfs-site.conf
...
<property>
  <name>dfs.permissions</name>
  <value>true</value>
</property>
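
As a quick cross-check of the settings above, here is a minimal sketch of a
helper that reads a property out of a Hadoop-style XML file and says which
user the containers would run as, following the explanation quoted further
down in this thread. The function names get_hadoop_property and
describe_executor are hypothetical (they are not part of Hadoop), and the
parsing assumes simple <name>/<value> pairs with no nested markup.

```shell
#!/bin/bash

# Hypothetical helper: print the <value> of a named property from a
# Hadoop-style XML config file. Newlines are removed first so that a
# <value> placed on the line after <name> is still matched.
get_hadoop_property() {
    local file="$1" name="$2"
    tr -d '\n' < "$file" \
      | sed -n "s|.*<name>${name}</name>[[:space:]]*<value>\([^<]*\)</value>.*|\1|p"
}

# Hypothetical helper: map an executor class name to the user the
# containers run as, per the explanation quoted in this thread.
# An empty value is treated as DefaultContainerExecutor, the YARN default.
describe_executor() {
    case "$1" in
        *LinuxContainerExecutor)      echo "containers run as the submitting user" ;;
        ""|*DefaultContainerExecutor) echo "containers run as the NodeManager daemon user" ;;
        *)                            echo "unknown executor: $1" ;;
    esac
}

# Example use (the path is an assumption, taken from HADOOP_CONF_DIR above):
#   cls=$(get_hadoop_property /etc/hadoop/conf/yarn-site.xml \
#         yarn.nodemanager.container-executor.class)
#   describe_executor "$cls"
```

With the yarn-site.xml shown above, this would report the NodeManager daemon
user, which matches the "yarn" ownership seen on the local output directory.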


-- 
Jonathan (Bill) Sparks
Software Architecture
Cray Inc.





On 10/31/13 6:12 AM, "Harsh J" <harsh@cloudera.com> wrote:

>In insecure mode the containers run as the daemon's owner, i.e.
>"yarn". Since the LocalFileSystem implementation has no way to
>impersonate any users (we don't run as root/etc.) it can create files
>only as the "yarn" user. On HDFS, we can send the right username in as
>a form of authentication, and it's reflected on the created files.
>
>If you enable the LinuxContainerExecutor (or generally enable
>security) then the containers run after being setuid'd to the
>submitting user, and your files would appear with the right owner.
>
>On Wed, Oct 30, 2013 at 1:49 AM, Bill Sparks <jsparks@cray.com> wrote:
>>
>> I have a strange use case and I'm looking for some debugging help.
>>
>>
>> Use Case:
>>
>> If I run the Hadoop MapReduce example wordcount program and write the
>> output to HDFS, the output directory has the correct ownership.
>>
>> E.g.
>>
>> hadoop jar
>> /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples-2.0.5-alpha.jar
>> wordcount /user/jdoe/simple/HF.txt /users/jdoe/simple/outtest1
>>
>> hdfs dfs -ls simple
>> Found 3 items
>> drwxr-xr-x - jdoe supergroup 0 2013-10-25 21:26 simple/HF.out
>> -rw-r--r-- 1 jdoe supergroup 610157 2013-10-25 21:21 simple/HF.txt
>> drwxr-xr-x - jdoe supergroup 0 2013-10-29 14:50 simple/outtest1
>>
>> Whereas if I write to a global filesystem, my output directory is owned
>> by yarn.
>>
>>
>> E.g.
>>
>> hadoop jar
>> /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples-2.0.5-alpha.jar
>> wordcount /user/jdoe/simple/HF.txt file:///scratch/jdoe/outtest1
>> ls -l /scratch/jdoe
>> total 8
>> drwxr-xr-x 2 root root 4096 Oct 28 23:26 logs
>> drwxr-xr-x 2 yarn yarn 4096 Oct 28 23:23 outtest1
>>
>>
>>
>> I've looked at the container log files and saw no errors. The only thing
>> I can think of is that the user authentication mode is "files:ldap" and
>> the nodemanager nodes do not have access to the corporate LDAP server, so
>> it's working off the local /etc/shadow, which does not have my
>> credentials - so it might just default to "yarn".
>>
>> I did find the following warning:
>>
>> 2013-10-29 14:58:52,184 INFO org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=jdoe OPERATION=Container Finished - Succeeded       TARGET=ContainerImpl    RESULT=SUCCESS APPID=application_1383020136544_0005 CONTAINERID=container_1383020136544_0005_01_000001
>> ...
>> 2013-10-29 14:58:53,062 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Trying to stop unknown container container_1383020136544_0005_01_000001
>> 2013-10-29 14:58:53,062 WARN org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=UnknownUser        IP=10.128.0.17  OPERATION=Stop Container Request TARGET=ContainerManagerImpl     RESULT=FAILURE DESCRIPTION=Trying to stop unknown container!      APPID=application_1383020136544_0005 CONTAINERID=container_1383020136544_0005_01_000001
>>
>>
>>
>>
>> Thanks,
>>    John
>>
>
>
>
>-- 
>Harsh J

