hadoop-mapreduce-user mailing list archives

From Todd Lipcon <t...@cloudera.com>
Subject Re: Problems with LinuxTaskController, LocalJobRunner, and localRunner directory
Date Fri, 06 May 2011 18:49:47 GMT
Hi Jeremy,

That's a good point - we don't currently do a good job of segregating the
configurations used for the LJR from the configs used for the TaskTracker.
In particular I think both mapred.local.dir and mapred.system.dir are used
by both.

You run into the same issue when trying to use LJR on a system with a
configured cluster, even if not using the LinuxTaskController features.

I'd recommend making a separate hadoop conf/ directory with a different
setting for mapred.local.dir.
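
A minimal sketch of that setup, in case it helps: it copies an existing conf/ directory and overrides mapred.local.dir in the copy's mapred-site.xml. The function names and the paths in the usage note are hypothetical, and it assumes the property lives in mapred-site.xml, which may not match your install.

```python
import shutil
import xml.etree.ElementTree as ET
from pathlib import Path


def set_property(site_file, name, value):
    """Set (or add) a <property> entry in a Hadoop *-site.xml file."""
    site_file = Path(site_file)
    if site_file.exists():
        tree = ET.parse(str(site_file))
        root = tree.getroot()
    else:
        # No site file yet: start from an empty <configuration> element.
        root = ET.Element("configuration")
        tree = ET.ElementTree(root)

    for prop in root.findall("property"):
        if prop.findtext("name", "").strip() == name:
            val = prop.find("value")
            if val is None:
                val = ET.SubElement(prop, "value")
            val.text = value
            break
    else:
        # Property not present: append a new one.
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value

    tree.write(str(site_file), encoding="utf-8", xml_declaration=True)


def make_ljr_conf(cluster_conf, ljr_conf, local_dir):
    """Copy the cluster conf dir and point mapred.local.dir elsewhere.

    ljr_conf must not exist yet; local_dir should be a directory the
    submitting user owns, separate from the TaskTracker's local dir.
    """
    shutil.copytree(str(cluster_conf), str(ljr_conf))
    set_property(Path(ljr_conf) / "mapred-site.xml", "mapred.local.dir", local_dir)
    return Path(ljr_conf)
```

For example, make_ljr_conf("/etc/hadoop/conf", "/home/jeremy/ljr-conf", "/tmp/jeremy/mapred-local") (all three paths are placeholders), then run the local job with HADOOP_CONF_DIR set to the new directory, or with hadoop --config pointing at it. The TaskTracker keeps using the original conf/ and its own mapred.local.dir.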

-Todd

On Fri, May 6, 2011 at 11:45 AM, <jeremy@lewi.us> wrote:

> Hi,
>
> I'm running hadoop (Cloudera release 3) in pseudo distributed mode, with
> the linux task controller so that jobs will run as the user who submitted
> them.
>
> My program (which uses hadoop cascading) fires off a job using
> LocalJobRunner (I think to read data from the local filesystem). So far so
> good.
> The job creates the directory
> /var/lib/hadoop-0.20/cache/pseudo/localRunner
> (/var/lib/hadoop-0.20/cache/pseudo being the value of mapred.local.dir)
>
> The problem is that localRunner isn't owned by the user mapred. Instead it's
> owned by the user who submitted the job. The next time I restart the
> daemons, the task tracker will fail because it can't rename
> /var/lib/hadoop-0.20/cache/pseudo/localRunner.
>
> Does anybody have suggestions how to fix this?
>
> Thanks
> Jeremy


-- 
Todd Lipcon
Software Engineer, Cloudera
