hadoop-common-user mailing list archives

From Craig Macdonald <cra...@dcs.gla.ac.uk>
Subject FileOutputFormat.getWorkOutputPath and map-to-reduce-only side-effect files
Date Thu, 22 Jan 2009 18:49:09 GMT
Hello Hadoop Core,

I have a very brief question: Our map tasks create side-effect files, in 
the directory returned by FileOutputFormat.getWorkOutputPath().

This works fine for making the side-effect files accessible to the
reducers.

However, as these map-generated side-effect files are only of use to the
reducers, it would be nice to have them deleted from the output
directory. We can't delete them in a reducer.close(), though, as this
would prevent them being accessible to other reduce tasks (speculative
or otherwise).

Any suggestions, short of deleting them after the job completes?
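
For reference, the post-job fallback I have in mind looks roughly like
the sketch below. It is only an illustration on a local directory using
plain java.nio: SideFileCleanup, deleteSideFiles and the "side-" naming
convention are made-up names, and on HDFS the same steps would have to
go through Hadoop's FileSystem API rather than java.nio.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop: removes map-generated
// side-effect files from a local output directory after the job is done.
final class SideFileCleanup {
    private SideFileCleanup() {}

    // Deletes every regular file directly under outputDir whose name
    // starts with the given prefix; returns the number of files deleted.
    static int deleteSideFiles(Path outputDir, String prefix) throws IOException {
        List<Path> matches = new ArrayList<>();
        // Collect the matching files first, so nothing is deleted while
        // the directory is still being listed.
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(
                outputDir,
                p -> p.getFileName().toString().startsWith(prefix))) {
            for (Path p : stream) {
                if (Files.isRegularFile(p)) {
                    matches.add(p);
                }
            }
        }
        for (Path p : matches) {
            Files.delete(p);
        }
        return matches.size();
    }
}
```

Called once from the job driver after the job has finished, e.g.
SideFileCleanup.deleteSideFiles(outputDir, "side-"), this leaves the
regular part-* output untouched, provided the side-effect files follow
the assumed naming convention.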

