hadoop-common-user mailing list archives

From Edward Capriolo <edlinuxg...@gmail.com>
Subject Re: conf.setNumReduceTasks(1) but the code called 3 times
Date Wed, 29 Jul 2009 14:41:57 GMT
On Wed, Jul 29, 2009 at 12:58 AM, Mark Kerzner<markkerzner@gmail.com> wrote:
> Hi,
> I set the number of reducers to 1, and I indeed get only one output
> file, /output/part-00000.
> However, in configure() and in close() I do a System.out, and I see that
> these are called three times, not one.
> Why does it matter to me? In configure I open a zip file, into which I write
> the binary parts of my maps, and in close() I close it. I would expect this
> to be called just once, producing one zip file, but instead it is called
> three (and 2 when running from IDE), so it produces 3 zip files. I have to
> play games so that the names of the zip files don't collide - and I am not
> sure if this is stable.
> What am I missing in my understanding?
> Thank you,
> Mark

You should take a look at all of the %speculative%execution properties.

They cause multiple copies of the same map or reduce task to be executed,
to try to deal with slow tasks. In applications that do web/FTP fetching or
file/database writing, you probably want these turned off.
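
For reference, here is a minimal sketch of what turning these off could look like. It assumes the property names used by Hadoop releases of this era (mapred.map.tasks.speculative.execution and mapred.reduce.tasks.speculative.execution); later releases renamed them, so check the mapred-default.xml of your version. The class name and the settings() helper are hypothetical, and the sketch only builds and prints the key/value pairs so that it runs without Hadoop on the classpath.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: collects the settings that disable speculative execution.
public class SpeculativeExecutionOff {
    // Assumed property names (Hadoop 0.19/0.20 era); verify against your release.
    static final String MAP_KEY = "mapred.map.tasks.speculative.execution";
    static final String REDUCE_KEY = "mapred.reduce.tasks.speculative.execution";

    // Returns both properties set to "false", in a stable order.
    static Map<String, String> settings() {
        Map<String, String> props = new LinkedHashMap<String, String>();
        props.put(MAP_KEY, "false");
        props.put(REDUCE_KEY, "false");
        return props;
    }

    public static void main(String[] args) {
        // In a real job driver you would apply each pair to the JobConf instead,
        // e.g. conf.setBoolean(key, false), before submitting the job.
        for (Map.Entry<String, String> e : settings().entrySet()) {
            System.out.println(e.getKey() + "=" + e.getValue());
        }
    }
}
```

In a driver that already has a JobConf, the same effect should be available through conf.setMapSpeculativeExecution(false) and conf.setReduceSpeculativeExecution(false). With speculation off, each task normally runs as a single attempt, although a failed attempt can still be retried, so configure() and close() may run more than once in that case.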
