hadoop-user mailing list archives

From Marcos Ortiz <mlor...@uci.cu>
Subject Re: issue with permissions of mapred.system.dir
Date Wed, 10 Oct 2012 00:07:36 GMT

On 10/09/2012 07:44 PM, Goldstone, Robin J. wrote:
> I am bringing up a Hadoop cluster for the first time (but am an 
> experienced sysadmin with lots of cluster experience) and running into 
> an issue with permissions on mapred.system.dir. It has generally been 
> a chore to figure out all the various directories that need to be 
> created to get Hadoop working, some on the local FS, others within 
> HDFS, getting the right ownership and permissions, etc. I think I am 
> mostly there but can't seem to get past my current issue with 
> mapred.system.dir.
>
> Some general info first:
> OS: RHEL6
> Hadoop version: hadoop-1.0.3-1.x86_64
>
> 20 node cluster configured as follows
> 1 node as primary namenode
> 1 node as secondary namenode + job tracker
> 18 nodes as datanode + tasktracker
>
> I have HDFS up and running and have the following in mapred-site.xml:
> <property>
>   <name>mapred.system.dir</name>
>   <value>hdfs://hadoop1/mapred</value>
>   <description>Shared data for JT - this must be in HDFS</description>
> </property>
>
> I have created this directory in HDFS, owner mapred:hadoop, 
> permissions 700 which seems to be the most common recommendation 
> amongst multiple, often conflicting articles about how to set up 
> Hadoop.  Here is the top level of my filesystem:
> hyperion-hdp4@hdfs:hadoop fs -ls /
> Found 3 items
> drwx------   - mapred hadoop          0 2012-10-09 12:58 /mapred
> drwxrwxrwx   - hdfs   hadoop          0 2012-10-09 13:00 /tmp
> drwxr-xr-x   - hdfs   hadoop          0 2012-10-09 12:51 /user
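
For reference, creating a directory with that owner and mode in HDFS typically comes down to something like the commands below, run as the HDFS superuser. This is only a sketch; the "hdfs" account name and the use of sudo are assumptions about the install.

  # create the JobTracker system directory and hand it to mapred:hadoop with mode 700
  sudo -u hdfs hadoop fs -mkdir /mapred
  sudo -u hdfs hadoop fs -chown mapred:hadoop /mapred
  sudo -u hdfs hadoop fs -chmod 700 /mapred
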
>
> Note, it doesn't seem to really matter what permissions I set on 
> /mapred since when the Jobtracker starts up it changes them to 700.
>
> However, when I try to run the hadoop example teragen program as a 
> "regular" user I am getting this error:
> hyperion-hdp4@robing:hadoop jar /usr/share/hadoop/hadoop-examples*.jar 
> teragen -D dfs.block.size=536870912 10000000000 
> /user/robing/terasort-input
> Generating 10000000000 using 2 maps with step of 5000000000
> 12/10/09 16:27:02 INFO mapred.JobClient: Running job: 
> job_201210072045_0003
> 12/10/09 16:27:03 INFO mapred.JobClient:  map 0% reduce 0%
> 12/10/09 16:27:03 INFO mapred.JobClient: Job complete: 
> job_201210072045_0003
> 12/10/09 16:27:03 INFO mapred.JobClient: Counters: 0
> 12/10/09 16:27:03 INFO mapred.JobClient: Job Failed: Job 
> initialization failed:
> org.apache.hadoop.security.AccessControlException: 
> org.apache.hadoop.security.AccessControlException: Permission denied: 
> user=robing, access=EXECUTE, inode="mapred":mapred:hadoop:rwx------
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
> at 
> org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:95)
> at 
> org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:57)
> at 
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.<init>(DFSClient.java:3251)
> at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:713)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:182)
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:555)
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:536)
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:443)
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:435)
> at 
> org.apache.hadoop.security.Credentials.writeTokenStorageFile(Credentials.java:169)
> at 
> org.apache.hadoop.mapred.JobInProgress.generateAndStoreTokens(JobInProgress.java:3537)
> at 
> org.apache.hadoop.mapred.JobInProgress.initTasks(JobInProgress.java:696)
> at org.apache.hadoop.mapred.JobTracker.initJob(JobTracker.java:4207)
> at 
> org.apache.hadoop.mapred.FairScheduler$JobInitializer$InitJob.run(FairScheduler.java:291)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> <rest of stack trace omitted>
>
> This seems to be saying that it is trying to write to the HDFS /mapred 
> directory as me (robing) rather than as mapred, the username under 
> which the jobtracker and tasktracker run.
>
> To verify this is what is happening, I manually changed the 
> permissions on /mapred from 700 to 755 since it claims to want execute 
> access:
> hyperion-hdp4@mapred:hadoop fs -chmod 755 /mapred
> hyperion-hdp4@mapred:hadoop fs -ls /
> Found 3 items
> drwxr-xr-x   - mapred hadoop          0 2012-10-09 12:58 /mapred
> drwxrwxrwx   - hdfs   hadoop          0 2012-10-09 13:00 /tmp
> drwxr-xr-x   - hdfs   hadoop          0 2012-10-09 12:51 /user
> hyperion-hdp4@mapred:
>
> Now I try running again and it fails again, this time complaining it 
> wants write access to /mapred:
> hyperion-hdp4@robing:hadoop jar /usr/share/hadoop/hadoop-examples*.jar 
> teragen -D dfs.block.size=536870912 10000000000 
> /user/robing/terasort-input
> Generating 10000000000 using 2 maps with step of 5000000000
> 12/10/09 16:31:29 INFO mapred.JobClient: Running job: 
> job_201210072045_0005
> 12/10/09 16:31:30 INFO mapred.JobClient:  map 0% reduce 0%
> 12/10/09 16:31:30 INFO mapred.JobClient: Job complete: 
> job_201210072045_0005
> 12/10/09 16:31:30 INFO mapred.JobClient: Counters: 0
> 12/10/09 16:31:30 INFO mapred.JobClient: Job Failed: Job 
> initialization failed:
> org.apache.hadoop.security.AccessControlException: 
> org.apache.hadoop.security.AccessControlException: Permission denied: 
> user=robing, access=WRITE, inode="mapred":mapred:hadoop:rwxr-xr-x
>
> So I changed the permissions on /mapred to 777:
> hyperion-hdp4@mapred:hadoop fs -chmod 777 /mapred
> hyperion-hdp4@mapred:hadoop fs -ls /
> Found 3 items
> drwxrwxrwx   - mapred hadoop          0 2012-10-09 12:58 /mapred
> drwxrwxrwx   - hdfs   hadoop          0 2012-10-09 13:00 /tmp
> drwxr-xr-x   - hdfs   hadoop          0 2012-10-09 12:51 /user
> hyperion-hdp4@mapred:
>
> And then I run again and this time it works.
> hyperion-hdp4@robing:hadoop jar /usr/share/hadoop/hadoop-examples*.jar 
> teragen -D dfs.block.size=536870912 10000000000 
> /user/robing/terasort-input
> Generating 10000000000 using 2 maps with step of 5000000000
> 12/10/09 16:33:02 INFO mapred.JobClient: Running job: 
> job_201210072045_0006
> 12/10/09 16:33:03 INFO mapred.JobClient:  map 0% reduce 0%
> 12/10/09 16:34:34 INFO mapred.JobClient:  map 1% reduce 0%
> 12/10/09 16:35:52 INFO mapred.JobClient:  map 2% reduce 0%
> etc…
>
> And indeed I can see that there is stuff written to /mapred under my 
> userid:
> # hyperion-hdp4 /root > hadoop fs -ls /mapred
> Found 2 items
> drwxrwxrwx   - robing hadoop          0 2012-10-09 16:33 
> /mapred/job_201210072045_0006
> -rw-------   2 mapred hadoop          4 2012-10-09 12:58 
> /mapred/jobtracker.info
>
> However, manually setting the permissions to 777 is not a workable 
> solution since any time I restart the jobtracker, it is setting the 
> permissions on /mapred back to 700.
>  hyperion-hdp3 /root > hadoop fs -ls /
> Found 3 items
> drwxrwxrwx   - mapred hadoop          0 2012-10-09 16:33 /mapred
> drwxrwxrwx   - hdfs   hadoop          0 2012-10-09 13:00 /tmp
> drwxr-xr-x   - hdfs   hadoop          0 2012-10-09 12:51 /user
> # hyperion-hdp3 /root > /etc/init.d/hadoop-jobtracker restart
> Stopping Hadoop jobtracker daemon (hadoop-jobtracker): stopping jobtracker
>  [  OK  ]
> Starting Hadoop jobtracker daemon (hadoop-jobtracker): starting 
> jobtracker, logging to 
> /var/log/hadoop/mapred/hadoop-mapred-jobtracker-hyperion-hdp3.out
>  [  OK  ]
> # hyperion-hdp3 /root > hadoop fs -ls /
> Found 3 items
> drwx------   - mapred hadoop          0 2012-10-09 16:38 /mapred
> drwxrwxrwx   - hdfs   hadoop          0 2012-10-09 13:00 /tmp
> drwxr-xr-x   - hdfs   hadoop          0 2012-10-09 12:51 /user
> # hyperion-hdp3 /root >
>
> So my questions are:
>
>  1. What are the right permissions on mapred.system.dir?
>
Why are you using an HDFS-based directory for mapred.system.dir?
If you want to share this directory, a good approach is to use NFS
instead.
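
For illustration, that idea could look something like the following in mapred-site.xml, assuming an NFS export mounted at the same path on the JobTracker and every TaskTracker. This is a rough sketch, not a tested configuration; /mnt/nfs/hadoop is a made-up mount point, so adjust it to your environment.

<property>
  <name>mapred.system.dir</name>
  <!-- illustrative NFS-backed path shared by the JobTracker and TaskTrackers -->
  <value>file:///mnt/nfs/hadoop/mapred/system</value>
  <description>JobTracker system directory on a shared NFS mount</description>
</property>

The directory would still need to exist on every node with ownership mapred:hadoop, just as with the HDFS variant.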

>  2. If not 700, how do I get the job tracker to stop changing them to 700?
>  3. If 700 is correct, then what am I doing wrong in my attempt to run
>     the example teragen program?
>
>
> Thank you in advance.
> Robin Goldstone, LLNL
>

-- 

Marcos Luis Ortíz Valmaseda
*Data Engineer && Sr. System Administrator at UCI*
about.me/marcosortiz <http://about.me/marcosortiz>
My Blog <http://marcosluis2186.posterous.com>
Tumblr's blog <http://marcosortiz.tumblr.com/>
@marcosluis2186 <http://twitter.com/marcosluis2186>


