hadoop-mapreduce-user mailing list archives

From Devaraj k <devara...@huawei.com>
Subject RE: Execution directory for child process within mapper
Date Mon, 26 Sep 2011 19:19:49 GMT
The localized distributed cache can also help here, if you can make the necessary changes
to your code. Cached files are localized on each node under ${mapred.local.dir}/taskTracker/archive/.
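A minimal sketch of wiring a file into the distributed cache at job setup time (the HDFS path and class name are hypothetical; the API is the 0.20-era org.apache.hadoop.filecache.DistributedCache named above):

```java
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.filecache.DistributedCache;

public class CacheSetup {
    public static void addInputFiles(Configuration conf) throws Exception {
        // Files added here are localized on each task node under
        // ${mapred.local.dir}/taskTracker/archive/ before the task starts,
        // so the mapper can read them locally.
        DistributedCache.addCacheFile(new URI("/user/joris/input/data.bin"), conf);
    }
}
```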

Based on your explanation, I think you can write the mapper so that it copies the files from
your customized location (/home/users/{user}/input/jobname) to the current working directory
and then starts the executable.
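A rough sketch of that copy step using plain java.nio file operations (the directory names are illustrative; in a real job this would run in the mapper's setup/configure method before launching the executable):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class StageInputs {
    // Copy every file from a source directory into the task's current
    // working directory, so an executable that expects its inputs to sit
    // next to it can find them.
    static void stage(Path srcDir, Path workDir) throws IOException {
        try (Stream<Path> files = Files.list(srcDir)) {
            for (Path src : files.collect(Collectors.toList())) {
                Files.copy(src, workDir.resolve(src.getFileName()),
                           StandardCopyOption.REPLACE_EXISTING);
            }
        }
    }
}
```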

I hope this helps. :)

From: Joris Poort [gpoort@gmail.com]
Sent: Tuesday, September 27, 2011 12:25 AM
To: mapreduce-user@hadoop.apache.org
Subject: Re: Execution directory for child process within mapper

Hi Devaraj,

Thanks for your help - that makes sense.  Is there any way to copy the
local files needed for execution to the mapred.local.dir?
Unfortunately I'm running local code which I cannot edit - and this
code is the one which assumes these files are available in the same
directory.

On Mon, Sep 26, 2011 at 11:40 AM, Devaraj k <devaraj.k@huawei.com> wrote:
> Hi Joris,
> You cannot configure the work directory directly. You can configure the local directory
with the property 'mapred.local.dir'; it is then used to create the work directory as
'${mapred.local.dir}/taskTracker/jobcache/$jobid/$taskid/work'. Your local command can
be referred to relative to that path.
> I hope this page will help you to understand the directory structure clearly. http://hadoop.apache.org/common/docs/r0.20.2/mapred_tutorial.html#Directory+Structure
> Thanks
> Devaraj
> ________________________________________
> From: Joris Poort [gpoort@gmail.com]
> Sent: Monday, September 26, 2011 11:20 PM
> To: mapreduce-user
> Subject: Execution directory for child process within mapper
> As part of my Java mapper I have a command that executes some standalone
> code on a local slave node. When I run the code it executes fine, unless
> it tries to access some local files, in which case I get an error
> that it cannot locate those files.
> Digging a little deeper it seems to be executing from the following directory:
>    /data/hadoop/mapred/local/taskTracker/{user}/jobcache/job_201109261253_0023/attempt_201109261253_0023_m_000001_0/work
> But I am intending to execute from a local directory where the
> relevant files are located:
>    /home/users/{user}/input/jobname
> Is there a way in java/hadoop to force the execution from the local
> directory, instead of the jobcache directory automatically created in
> hadoop?
> Is there perhaps a better way to go about this?
> Any help on this would be greatly appreciated!
> Cheers,
> Joris
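For reference, if the standalone code is launched from the mapper as a child process, its working directory can also be forced at launch time with java.lang.ProcessBuilder, rather than being inherited from the task's work directory (the helper name, command, and path below are illustrative):

```java
import java.io.File;
import java.io.IOException;

public class RunInDirectory {
    // Launch a command with its working directory set to the given path;
    // the child process then resolves relative file names against it.
    static int runIn(File dir, String... command)
            throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.directory(dir);   // working directory for the child process
        pb.inheritIO();      // pass through stdout/stderr
        return pb.start().waitFor();
    }
}
```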
