hadoop-mapreduce-user mailing list archives

From Michael Segel <michael_se...@hotmail.com>
Subject Re: Is there a way to keep all intermediate files there after the MapReduce Job run?
Date Fri, 01 Mar 2013 13:23:19 GMT
Your job.xml file is kept for a set period of time. 
I believe the others are automatically removed. 

You can easily access the job.xml file from the JobTracker (JT) web page.

On Mar 1, 2013, at 4:14 AM, Ling Kun <lkun.erlv@gmail.com> wrote:

> Dear all,
>     To learn more about file creation and file sizes while a job is running,
> I want to keep all the intermediate files (job.xml, spillN.out, file.out,
> file.index, map.out-N, etc.).
> My questions are:
> 1. Are there any configuration settings that can make this happen? Or could
> I modify some Hadoop MapReduce code for this?
> 2. Since each job, each task, and each task attempt uses a different
> directory to store its intermediate files, keeping the files without deleting
> them should not hurt the MapReduce cluster, apart from taking up some
> storage. Am I right?
> Thanks 
> yours,
> Ling Kun
> -- 
> http://www.lingcc.com
