hadoop-common-dev mailing list archives

From Doug Cutting <cutt...@apache.org>
Subject Re: mapreduce does the wrong thing with dfs permissions?
Date Tue, 26 Feb 2008 22:11:12 GMT
It looks like the 'mapreduce' user does not have permission to list the 
job directory.  Can you provide 'ls' output of that directory?  Have you 
altered permission settings at all in your configuration?
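
To make the failure in the quoted trace concrete, here is a minimal Python sketch of a POSIX-style owner/group/other check applied to the values from the error message. It is an illustration only, not Hadoop's PermissionChecker: the function names `parse_mode` and `is_allowed` are hypothetical, the group membership of the 'mapreduce' user is an assumption (the thread does not state it), and the namenode superuser bypass is not modeled.

```python
# Illustrative sketch only; not Hadoop's actual PermissionChecker code.
READ, WRITE, EXECUTE = 4, 2, 1


def parse_mode(symbolic):
    """Convert a 9-character string such as 'rwx-wx-wx' into
    (owner_bits, group_bits, other_bits)."""
    if len(symbolic) != 9:
        raise ValueError("expected 9 permission characters")
    masks = []
    for start in range(0, 9, 3):
        bits = 0
        for ch, bit in zip(symbolic[start:start + 3], (READ, WRITE, EXECUTE)):
            if ch != "-":
                bits |= bit
        masks.append(bits)
    return tuple(masks)


def is_allowed(user, user_groups, owner, group, symbolic, requested):
    """Pick exactly one permission class (owner, then group, then other)
    and require every requested bit to be present in it."""
    owner_bits, group_bits, other_bits = parse_mode(symbolic)
    if user == owner:
        effective = owner_bits
    elif group in user_groups:
        effective = group_bits
    else:
        effective = other_bits
    return (effective & requested) == requested


# Values taken from the error message: the inode is owned by
# hadoop:supergroup with mode rwx-wx-wx, and the request is READ_EXECUTE.
MODE = "rwx-wx-wx"
WANTED = READ | EXECUTE

# Assumption: 'mapreduce' is not a member of 'supergroup'.
print(is_allowed("mapreduce", [], "hadoop", "supergroup", MODE, WANTED))
# Even if it were a member, the group class is also -wx.
print(is_allowed("mapreduce", ["supergroup"], "hadoop", "supergroup", MODE, WANTED))
# The owner has rwx, so the same request succeeds for 'hadoop'.
print(is_allowed("hadoop", ["supergroup"], "hadoop", "supergroup", MODE, WANTED))
```

Under these assumptions only the owner can list the directory: both the group and other classes lack the read bit, which matches the access=READ_EXECUTE denial reported for user=mapreduce.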


Michael Bieniosek wrote:
> In this job, the namenode, the jobtracker, and the job submitter are all called "hadoop".
> The jobtracker display also indicates that the job is submitted by "hadoop".  The tasktracker
> runs as the unix account "mapreduce".
> The job failed.  All the tasks have this error message:
> Error initializing task_200802230215_0001_m_000000_0:
> org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.fs.permission.AccessControlException:
> Permission denied: user=mapreduce, access=READ_EXECUTE, inode="job_200802230215_0001":hadoop:supergroup:rwx-wx-wx
>     at org.apache.hadoop.dfs.PermissionChecker.check(PermissionChecker.java:171)
>     at org.apache.hadoop.dfs.PermissionChecker.checkPermission(PermissionChecker.java:106)
>     at org.apache.hadoop.dfs.FSNamesystem.checkPermission(FSNamesystem.java:4016)
>     at org.apache.hadoop.dfs.FSNamesystem.checkPathAccess(FSNamesystem.java:3976)
>     at org.apache.hadoop.dfs.FSNamesystem.getListing(FSNamesystem.java:1810)
>     at org.apache.hadoop.dfs.NameNode.getListing(NameNode.java:433)
>     at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>     at java.lang.reflect.Method.invoke(Unknown Source)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:409)
>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:910)
>     at org.apache.hadoop.ipc.Client.call(Client.java:512)
>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:198)
>     at org.apache.hadoop.dfs.$Proxy5.getListing(Unknown Source)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>     at java.lang.reflect.Method.invoke(Unknown Source)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
>     at org.apache.hadoop.dfs.$Proxy5.getListing(Unknown Source)
>     at org.apache.hadoop.dfs.DFSClient.listPaths(DFSClient.java:439)
>     at org.apache.hadoop.dfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:165)
>     at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:620)
>     at org.apache.hadoop.mapred.TaskTracker.startNewTask(TaskTracker.java:1287)
>     at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:928)
>     at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:1323)
>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:2197)
> On 2/26/08 1:17 PM, "Hairong Kuang" <hairong@yahoo-inc.com> wrote:
> Before you file a jira, could you please post the error message? Let us
> check what went wrong in your case. The design is that every job talks to
> dfs as the user who submitted the job. Please check
> https://issues.apache.org/jira/browse/HADOOP-1873 for more information on
> user permissions and mapred.
> Hairong
> On 2/26/08 12:08 PM, "Michael Bieniosek" <michael@powerset.com> wrote:
>> This is not the behavior I was seeing -- to use your example, the tasktracker
>> tried to talk to the DFS as the "foo" user, not the "bar" user who
>> submitted the job.  Should I file a JIRA then?
>> -Michael
>> On 2/26/08 11:13 AM, "s29752-hadoopdev@yahoo.com" <s29752-hadoopdev@yahoo.com>
>> wrote:
>>> The problem is that the tasktrackers always run under the same UNIX account,
>>> "mapreduce".  I can submit a job as "user", but the tasktracker will still
>>> talk to the dfs as the "mapreduce" user.  This means that everything that
>>> hadoop mapreduce touches has to be owned in the dfs by the "mapreduce" user.
>>> If everything is owned and run by the same user, then permissions are
>>> pointless.
>> I do not quite understand your situation, but the tasktracker account should
>> not matter.  Suppose a tasktracker is run by foo and a job is submitted by
>> bar.  Then the permission checking during the execution of the job is against
>> the job submitter (bar), not the tasktracker (foo).  In your case, if the job is
>> submitted by "user" and "user" is able to read the input files and access
>> other required files, then you should not get any AccessControlException.
>> Nicholas
