hadoop-user mailing list archives

From Arpit Gupta <ar...@hortonworks.com>
Subject Re: issue with permissions of mapred.system.dir
Date Wed, 10 Oct 2012 16:35:08 GMT
Robin

I will try to investigate the issue with the fair scheduler. Do let us know if switching to the default or the capacity scheduler solves the issue.

--
Arpit Gupta
Hortonworks Inc.
http://hortonworks.com/
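For reference, in Hadoop 1.x the JobTracker's scheduler is selected in mapred-site.xml. A sketch of switching from the fair scheduler to the capacity scheduler might look like the following (class names as shipped in the 1.0.x distribution; verify against the jars actually installed on your cluster):

```xml
<!-- mapred-site.xml: replace the FairScheduler with the CapacityTaskScheduler.
     If this property is unset, Hadoop 1.x falls back to the FIFO
     JobQueueTaskScheduler. -->
<property>
  <name>mapred.jobtracker.taskScheduler</name>
  <value>org.apache.hadoop.mapred.CapacityTaskScheduler</value>
</property>
```

The JobTracker must be restarted for a scheduler change to take effect.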

On Oct 10, 2012, at 9:32 AM, "Goldstone, Robin J." <goldstone1@llnl.gov> wrote:

> There is no /hadoop1 directory.  It is //hadoop1, where "hadoop1" is the name of the server running the namenode daemon:
> 
>> <value>hdfs://hadoop1/mapred</value>
> 
> Per offline conversations with Arpit, it appears this problem is related to the fact that I am using the fair scheduler.  The fair scheduler is designed to run MapReduce jobs as the submitting user, rather than under the mapred username.  Apparently there are some issues with this scheduler related to permissions: certain directories do not allow other users to execute/write in places that are necessary for the job to run.  I haven't yet tried Arpit's suggestion to switch to the default task scheduler, but I imagine it will resolve my issue, at least for now.  Ultimately I do want to use the fair scheduler, as multi-tenancy is a key requirement for our Hadoop deployment.
> 
> From: Manu S <manupkd87@gmail.com>
> Reply-To: "user@hadoop.apache.org" <user@hadoop.apache.org>
> Date: Wednesday, October 10, 2012 3:34 AM
> To: "user@hadoop.apache.org" <user@hadoop.apache.org>
> Subject: Re: issue with permissions of mapred.system.dir
> 
> What is the permission for the /hadoop1 dir in HDFS? Does the "mapred" user have permission on the same directory?
> 
> Thanks,
> Manu S
> 
> On Wed, Oct 10, 2012 at 5:52 AM, Arpit Gupta <arpit@hortonworks.com> wrote:
>> What is your "mapreduce.jobtracker.staging.root.dir" set to? This is a directory that needs to be writable by the user, and it is recommended to be set to "/user" so that job files are written under the appropriate user's home directory.
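As a sketch, the setting Arpit describes would go in mapred-site.xml along these lines (the "/user" value follows his recommendation; adjust to your own HDFS layout):

```xml
<!-- mapred-site.xml: root of the per-user job staging area. With /user as
     the root, each job's staging files land under /user/<username>/.staging,
     a directory the submitting user owns and can write to. -->
<property>
  <name>mapreduce.jobtracker.staging.root.dir</name>
  <value>/user</value>
</property>
```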
>> 
>> --
>> Arpit Gupta
>> Hortonworks Inc.
>> http://hortonworks.com/
>> 
>> On Oct 9, 2012, at 4:44 PM, "Goldstone, Robin J." <goldstone1@llnl.gov> wrote:
>> 
>>> I am bringing up a Hadoop cluster for the first time (but am an experienced sysadmin with lots of cluster experience) and am running into an issue with permissions on mapred.system.dir.  It has generally been a chore to figure out all the various directories that need to be created to get Hadoop working (some on the local FS, others within HDFS), getting the right ownership and permissions, etc.  I think I am mostly there but can't seem to get past my current issue with mapred.system.dir.
>>> 
>>> Some general info first:
>>> OS: RHEL6
>>> Hadoop version: hadoop-1.0.3-1.x86_64
>>> 
>>> 20 node cluster configured as follows
>>> 1 node as primary namenode
>>> 1 node as secondary namenode + job tracker
>>> 18 nodes as datanode + tasktracker
>>> 
>>> I have HDFS up and running and have the following in mapred-site.xml:
>>> <property>
>>>   <name>mapred.system.dir</name>
>>>   <value>hdfs://hadoop1/mapred</value>
>>>   <description>Shared data for JT - this must be in HDFS</description>
>>> </property>
>>> 
>>> I have created this directory in HDFS with owner mapred:hadoop and permissions 700, which seems to be the most common recommendation amongst multiple, often conflicting, articles about how to set up Hadoop.  Here is the top level of my filesystem:
>>> hyperion-hdp4@hdfs:hadoop fs -ls /
>>> Found 3 items
>>> drwx------   - mapred hadoop          0 2012-10-09 12:58 /mapred
>>> drwxrwxrwx   - hdfs   hadoop          0 2012-10-09 13:00 /tmp
>>> drwxr-xr-x   - hdfs   hadoop          0 2012-10-09 12:51 /user
>>> 
>>> Note, it doesn't seem to really matter what permissions I set on /mapred, since when the Jobtracker starts up it changes them to 700.
>>> 
>>> However, when I try to run the hadoop example teragen program as a "regular"
user I am getting this error:
>>> hyperion-hdp4@robing:hadoop jar /usr/share/hadoop/hadoop-examples*.jar teragen
-D dfs.block.size=536870912 10000000000 /user/robing/terasort-input
>>> Generating 10000000000 using 2 maps with step of 5000000000
>>> 12/10/09 16:27:02 INFO mapred.JobClient: Running job: job_201210072045_0003
>>> 12/10/09 16:27:03 INFO mapred.JobClient:  map 0% reduce 0%
>>> 12/10/09 16:27:03 INFO mapred.JobClient: Job complete: job_201210072045_0003
>>> 12/10/09 16:27:03 INFO mapred.JobClient: Counters: 0
>>> 12/10/09 16:27:03 INFO mapred.JobClient: Job Failed: Job initialization failed:
>>> org.apache.hadoop.security.AccessControlException: org.apache.hadoop.security.AccessControlException:
Permission denied: user=robing, access=EXECUTE, inode="mapred":mapred:hadoop:rwx------
>>> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>>> at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>>> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>>> at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>>> at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:95)
>>> at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:57)
>>> at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.<init>(DFSClient.java:3251)
>>> at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:713)
>>> at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:182)
>>> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:555)
>>> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:536)
>>> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:443)
>>> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:435)
>>> at org.apache.hadoop.security.Credentials.writeTokenStorageFile(Credentials.java:169)
>>> at org.apache.hadoop.mapred.JobInProgress.generateAndStoreTokens(JobInProgress.java:3537)
>>> at org.apache.hadoop.mapred.JobInProgress.initTasks(JobInProgress.java:696)
>>> at org.apache.hadoop.mapred.JobTracker.initJob(JobTracker.java:4207)
>>> at org.apache.hadoop.mapred.FairScheduler$JobInitializer$InitJob.run(FairScheduler.java:291)
>>> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>> at java.lang.Thread.run(Thread.java:662)
>>> <rest of stack trace omitted>
>>> 
>>> This seems to be saying that it is trying to write to the HDFS /mapred filesystem as me (robing), rather than as mapred, the username under which the jobtracker and tasktracker run.
>>> 
>>> To verify this is what is happening, I manually changed the permissions on /mapred from 700 to 755, since it claims to want execute access:
>>> hyperion-hdp4@mapred:hadoop fs -chmod 755 /mapred
>>> hyperion-hdp4@mapred:hadoop fs -ls /
>>> Found 3 items
>>> drwxr-xr-x   - mapred hadoop          0 2012-10-09 12:58 /mapred
>>> drwxrwxrwx   - hdfs   hadoop          0 2012-10-09 13:00 /tmp
>>> drwxr-xr-x   - hdfs   hadoop          0 2012-10-09 12:51 /user
>>> hyperion-hdp4@mapred:
>>> 
>>> Now I try running again and it fails again, this time complaining that it wants write access to /mapred:
>>> hyperion-hdp4@robing:hadoop jar /usr/share/hadoop/hadoop-examples*.jar teragen
-D dfs.block.size=536870912 10000000000 /user/robing/terasort-input
>>> Generating 10000000000 using 2 maps with step of 5000000000
>>> 12/10/09 16:31:29 INFO mapred.JobClient: Running job: job_201210072045_0005
>>> 12/10/09 16:31:30 INFO mapred.JobClient:  map 0% reduce 0%
>>> 12/10/09 16:31:30 INFO mapred.JobClient: Job complete: job_201210072045_0005
>>> 12/10/09 16:31:30 INFO mapred.JobClient: Counters: 0
>>> 12/10/09 16:31:30 INFO mapred.JobClient: Job Failed: Job initialization failed:
>>> org.apache.hadoop.security.AccessControlException: org.apache.hadoop.security.AccessControlException:
Permission denied: user=robing, access=WRITE, inode="mapred":mapred:hadoop:rwxr-xr-x
>>> 
>>> So I changed the permissions on /mapred to 777:
>>> hyperion-hdp4@mapred:hadoop fs -chmod 777 /mapred
>>> hyperion-hdp4@mapred:hadoop fs -ls /
>>> Found 3 items
>>> drwxrwxrwx   - mapred hadoop          0 2012-10-09 12:58 /mapred
>>> drwxrwxrwx   - hdfs   hadoop          0 2012-10-09 13:00 /tmp
>>> drwxr-xr-x   - hdfs   hadoop          0 2012-10-09 12:51 /user
>>> hyperion-hdp4@mapred:
>>> 
>>> And then I run again and this time it works.
>>> hyperion-hdp4@robing:hadoop jar /usr/share/hadoop/hadoop-examples*.jar teragen
-D dfs.block.size=536870912 10000000000 /user/robing/terasort-input
>>> Generating 10000000000 using 2 maps with step of 5000000000
>>> 12/10/09 16:33:02 INFO mapred.JobClient: Running job: job_201210072045_0006
>>> 12/10/09 16:33:03 INFO mapred.JobClient:  map 0% reduce 0%
>>> 12/10/09 16:34:34 INFO mapred.JobClient:  map 1% reduce 0%
>>> 12/10/09 16:35:52 INFO mapred.JobClient:  map 2% reduce 0%
>>> etc…
>>> 
>>> And indeed I can see that there is stuff written to /mapred under my userid:
>>> # hyperion-hdp4 /root > hadoop fs -ls /mapred
>>> Found 2 items
>>> drwxrwxrwx   - robing hadoop          0 2012-10-09 16:33 /mapred/job_201210072045_0006
>>> -rw-------   2 mapred hadoop          4 2012-10-09 12:58 /mapred/jobtracker.info
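The two denials above (first EXECUTE, then WRITE) follow ordinary POSIX directory semantics, which HDFS mirrors: the x bit on a directory gates traversal into it, and the w bit gates creating entries inside it. A local sketch of the three /mapred modes tried in this thread, no HDFS required:

```shell
# Local-filesystem analogy for the /mapred modes above (plain POSIX, not HDFS).
d="$(mktemp -d)/mapred"
mkdir "$d"

chmod 700 "$d"
stat -c '%a' "$d"   # 700: other users lack x, hence the EXECUTE denial
chmod 755 "$d"
stat -c '%a' "$d"   # 755: others can traverse, but lack w, hence the WRITE denial
chmod 777 "$d"
stat -c '%a' "$d"   # 777: anyone can create job directories (the workaround)
```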
>>> 
>>> However, manually setting the permissions to 777 is not a workable solution, since any time I restart the jobtracker, it sets the permissions on /mapred back to 700.
>>>  hyperion-hdp3 /root > hadoop fs -ls /
>>> Found 3 items
>>> drwxrwxrwx   - mapred hadoop          0 2012-10-09 16:33 /mapred
>>> drwxrwxrwx   - hdfs   hadoop          0 2012-10-09 13:00 /tmp
>>> drwxr-xr-x   - hdfs   hadoop          0 2012-10-09 12:51 /user
>>> # hyperion-hdp3 /root > /etc/init.d/hadoop-jobtracker restart
>>> Stopping Hadoop jobtracker daemon (hadoop-jobtracker): stopping jobtracker
>>>                                                            [  OK  ]
>>> Starting Hadoop jobtracker daemon (hadoop-jobtracker): starting jobtracker, logging
to /var/log/hadoop/mapred/hadoop-mapred-jobtracker-hyperion-hdp3.out
>>>                                                            [  OK  ]
>>> # hyperion-hdp3 /root > hadoop fs -ls /
>>> Found 3 items
>>> drwx------   - mapred hadoop          0 2012-10-09 16:38 /mapred
>>> drwxrwxrwx   - hdfs   hadoop          0 2012-10-09 13:00 /tmp
>>> drwxr-xr-x   - hdfs   hadoop          0 2012-10-09 12:51 /user
>>> # hyperion-hdp3 /root > 
>>> 
>>> So my questions are:
>>> What are the right permissions on mapred.system.dir?
>>> If not 700, how do I get the job tracker to stop changing them to 700?
>>> If 700 is correct, then what am I doing wrong in my attempt to run the example
teragen program?
>>> 
>>> Thank you in advance.
>>> Robin Goldstone, LLNL
>>> 
>> 
> 

