hadoop-common-dev mailing list archives

From: Hairong Kuang <hair...@yahoo-inc.com>
Subject: Re: mapreduce does the wrong thing with dfs permissions?
Date: Tue, 26 Feb 2008 06:57:24 GMT
One solution is to change the ownership or rwx permissions of the input
files. If you log in as the user that starts the namenode, you become the
DFS superuser, so you have permission to change any file's ownership and
permissions.
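
For example, something along these lines run as that superuser. This is only
a minimal sketch using the FileSystem API: the input path and the 'mapreduce'
owner/group are placeholders for your actual job input and the account the
tasktrackers run as, and the fs shell's chown/chmod commands do the same thing
from the command line.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.fs.permission.FsPermission;

    public class FixInputPerms {
      public static void main(String[] args) throws Exception {
        // Run this as the user that started the namenode, i.e. the DFS superuser.
        FileSystem fs = FileSystem.get(new Configuration());

        // Placeholder input directory; substitute the real job input path.
        Path input = new Path("/user/hadoop/job-input");

        // Either hand the input over to the account the tasktrackers run as...
        fs.setOwner(input, "mapreduce", "mapreduce");

        // ...or keep ownership as-is and just loosen the mode (world-readable here).
        fs.setPermission(input, new FsPermission((short) 0755));
      }
    }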


On 2/25/08 4:27 PM, "Michael Bieniosek" <michael@powerset.com> wrote:

> Hi,
> 
> I upgraded the cluster to hadoop-0.16 and immediately noticed that nothing
> worked because of the new dfs permissions.
> 
> Here's our situation:
> 
> - namenode, jobtracker, datanodes run as UNIX 'hadoop' account.
> - jobs are submitted to hadoop by an RPC front-end running as the 'hadoop' user
> - tasktrackers run as a limited 'mapreduce' account.
> 
> When the tasktrackers try to read/write DFS, they hit a permission error,
> because they are running as 'mapreduce' user, but everything is owned by
> 'hadoop' user.
> 
> Eventually I'd like to get to a scenario where our RPC front-end tells
> hadoop the real name of the submitting user, which is not the same as the
> UNIX account that the RPC front-end uses.
> 
> I think the problem here is that DFS assumes that the DFS user should be
> the same as the UNIX user.  This won't work in clusters where different UNIX
> accounts can submit mapreduce jobs, because the only way to change the UNIX
> account is to run the tasktracker as root to begin with (which I don't want
> to do).
> 
> So, I can obviously turn off permissions, or set up the supergroup so that
> they don't matter.  But permissions seem like a good idea, so it's a shame
> to have to do this.
> 
> Instead, I'd like a model where the hadoop user for purposes of DFS
> permissions is completely separated from the UNIX account.  So, when I
> submit a job, I should be able to specify the hadoop user for the job.  The
> tasktrackers, which still have to run as the limited 'mapreduce' UNIX
> account, can remember to make all their DFS accesses as the hadoop user I
> specified when I started the mapreduce job.
> 
> Or am I missing something?
> 
> Thanks,
> Michael
> 
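
On the "turn off permissions or set up the supergroup" option: both are
namenode-side settings. For reference, a minimal hadoop-site.xml fragment on
the namenode, with the property names as given in the 0.16 permissions guide
and 'mapreduce' standing in for whatever group the tasktracker account
actually belongs to; pick one or the other, not both:

    <!-- Option 1: disable DFS permission checking entirely. -->
    <property>
      <name>dfs.permissions</name>
      <value>false</value>
    </property>

    <!-- Option 2: keep checking on, but make the tasktrackers' group the
         superuser group. -->
    <property>
      <name>dfs.permissions.supergroup</name>
      <value>mapreduce</value>
    </property>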

