hadoop-common-user mailing list archives

From Amareshwari Sriramadasu <amar...@yahoo-inc.com>
Subject Re: intermediate files of killed tasks not purged
Date Tue, 28 Apr 2009 08:32:51 GMT
Hi Sandhya,

  Which version of Hadoop are you using? Pre-0.17, there could be 
<attempt_id> directories directly under mapred/local; now there should 
not be any such directories.
 From version 0.17 onwards, the attempt directories are present only 
at mapred/local/taskTracker/jobCache/<jobid>/<attemptid>. If you are 
seeing the directories in any other location, it looks like a bug.

HADOOP-4654 is about cleaning up temporary data in DFS for failed 
tasks; it does not touch files on the local filesystem.

Thanks
Amareshwari
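Until an upgrade is possible, one workaround for the disk-space problem is a periodic sweep of stale attempt_* directories on each TaskTracker node. This is only a sketch, not an official Hadoop tool: the HADOOP_TMP path is an assumption standing in for your actual hadoop.tmp.dir, and you must confirm none of the matched attempts are still running before deleting. The demo below builds a throwaway directory tree mimicking the pre-0.17 layout described in the thread, then prunes it:

```shell
# Sketch only: simulate the pre-0.17 mapred/local layout, then prune
# attempt_* directories. HADOOP_TMP is a stand-in for hadoop.tmp.dir;
# on a real node, point it there and add an age filter (e.g. -mtime +1)
# so live attempts are never touched.
HADOOP_TMP=$(mktemp -d)
mkdir -p "$HADOOP_TMP/mapred/local/attempt_200904262046_0026_m_000002_0"
touch "$HADOOP_TMP/mapred/local/attempt_200904262046_0026_m_000002_0/intermediate.1"

# Remove each attempt_* directory and its intermediate.* files.
find "$HADOOP_TMP/mapred/local" -maxdepth 1 -type d -name 'attempt_*' \
    -exec rm -rf {} +

rm -rf "$HADOOP_TMP"
```

Run from cron, the find line alone (with a -mtime filter) would keep disk usage bounded between restarts.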
Edward J. Yoon wrote:
> Hi,
>
> It seems related to https://issues.apache.org/jira/browse/HADOOP-4654.
>
> On Tue, Apr 28, 2009 at 4:01 PM, Sandhya E <sandhyabhaskar@gmail.com> wrote:
>   
>> Hi
>>
>> Under <hadoop-tmp-dir>/mapred/local there are directories like
>> "attempt_200904262046_0026_m_000002_0"
>> Each of these directories contains files of format: intermediate.1
>> intermediate.2  intermediate.3  intermediate.4  intermediate.5
>> There are many directories in this format. All of these correspond to
>> killed task attempts. As they contain huge intermediate files, we
>> ran into disk-space issues.
>>
>> They are cleaned up when the mapred cluster is restarted. But otherwise,
>> how can these be cleaned up without having to restart the cluster?
>>
>> Conf parameter "keep.failed.task.files" is set to "false" in our case.
>>
>> Many Thanks
>> Sandhya
>>

