hadoop-common-user mailing list archives

From Nathan Marz <nat...@rapleaf.com>
Subject _temporary directories not deleted
Date Tue, 04 Nov 2008 19:46:26 GMT
Hello all,

Occasionally when running jobs, Hadoop fails to clean up the
"_temporary" directories it leaves behind. This appears to happen
only when a task is killed (such as a speculative execution attempt),
and the data that task has written so far is not cleaned up. Is this a
known issue in Hadoop? Is the data from that task guaranteed to
duplicate what was written by another task? Is it safe to delete
this directory without worrying about losing data?

Nathan Marz
