hadoop-common-user mailing list archives

From aayush <aayushgupta...@gmail.com>
Subject Re: Separating mapper intermediate files
Date Tue, 27 Mar 2012 12:18:17 GMT
Thanks Harsh.

I set mapred.local.dir as you suggested. Four folders were created in it (jobtracker, tasktracker,
tt_private, etc.), but I could not see an attempt directory. Can you let me know exactly where to
look in this directory structure?

Furthermore, it seems that all the intermediate spill files and map output are cleaned up when the
mapper finishes. I want to inspect those intermediate files, so I don't want them to be cleaned
up. How can I prevent the cleanup?
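
A minimal sketch of one way to search for the attempt directories mentioned above, written in Python. It assumes attempt directories have names beginning with `attempt_` and may sit several levels below mapred.local.dir; the exact layout depends on the Hadoop version, so the script walks the whole tree instead of relying on a fixed path. The function name `find_attempt_dirs` is hypothetical and is not part of Hadoop. Because the files appear to be cleaned up when the task finishes, a search like this would probably have to run while the job is still active.

```python
import os


def find_attempt_dirs(local_dir):
    """Return sorted paths of all directories under local_dir named attempt_*.

    Assumption: task attempt directories start with the prefix "attempt_".
    Unreadable subdirectories are skipped silently (os.walk's default).
    """
    matches = []
    for root, dirs, _files in os.walk(local_dir):
        for name in dirs:
            if name.startswith("attempt_"):
                matches.append(os.path.join(root, name))
    return sorted(matches)


def list_files(directory):
    """Return sorted relative paths of every regular file below directory."""
    found = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            full_path = os.path.join(root, name)
            found.append(os.path.relpath(full_path, directory))
    return sorted(found)
```

Calling `find_attempt_dirs` with the value configured for mapred.local.dir, and then `list_files` on each result, would show whatever spill and output files exist at that moment.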

Thanks a lot

On Mar 27, 2012, at 1:16 AM, "Harsh J-2 [via Hadoop Common]"<ml-node+s472056n3860389h94@n3.nabble.com>
wrote:

> Hello Aayush, 
> 
> Three things that'd help clear your confusion: 
> 1. dfs.data.dir controls where HDFS blocks are to be stored. Set this 
> to a partition1 path. 
> 2. mapred.local.dir controls where intermediate task data goes. Set 
> this to a partition2 path. 
> 
> > Furthermore, can someone also tell me how to save intermediate mapper 
> > files(spill outputs) and where are they saved. 
> 
> Intermediate outputs are handled by the framework itself (there is no 
> user/manual work involved), and are saved inside attempt directories 
> under mapred.local.dir. 
> 
> On Tue, Mar 27, 2012 at 4:46 AM, aayush <[hidden email]> wrote: 
> > I am a newbie to Hadoop and map reduce. I am running a single node hadoop 
> > setup. I have created 2 partitions on my HDD. I want the mapper intermediate 
> > files (i.e. the spill files and the mapper output) to be sent to a file 
> > system on Partition1 whereas everything else including HDFS should be run on 
> > partition2. I am struggling to find the appropriate parameters in the conf 
> > files. I understand that there is hadoop.tmp.dir and mapred.local.dir but am 
> > not sure which to use for what. I would really appreciate it if someone could tell me 
> > exactly which parameters to modify to achieve the goal. 
> 
> -- 
> Harsh J 


--
View this message in context: http://hadoop-common.472056.n3.nabble.com/Separating-mapper-intermediate-files-tp3859787p3861159.html
Sent from the Users mailing list archive at Nabble.com.