hadoop-common-user mailing list archives

From Mark Kerzner <markkerz...@gmail.com>
Subject Re: conf.setNumReduceTasks(1) but the code called 3 times
Date Wed, 29 Jul 2009 17:11:26 GMT
I think that was it, or close to it: the job now goes through my Reducer code only
twice instead of multiple times. I would like it to run just once, but I can
perhaps live with that; after all, writing zip files myself, outside the Hadoop
paradigm, is not exactly standard.
My second concern is how to control this when running on Amazon Elastic
MapReduce; I could not find a way to do it.
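One thing I have not verified on Elastic MapReduce, so treat it only as a sketch
(the ZipJob class name below is just a placeholder): since the JobConf is built
in the driver and shipped with the job, setting the properties there should
apply on the Amazon cluster as well, unless the cluster marks them final.

    import org.apache.hadoop.mapred.JobConf;

    // Relevant driver lines only, not the full job setup.
    JobConf conf = new JobConf(ZipJob.class);   // ZipJob is a placeholder
    conf.setNumReduceTasks(1);
    // Turn off both speculative-execution switches for side-effecting tasks:
    conf.setBoolean("mapred.map.tasks.speculative.execution", false);
    conf.setBoolean("mapred.reduce.tasks.speculative.execution", false);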

Thanks!

Mark

On Wed, Jul 29, 2009 at 9:41 AM, Edward Capriolo <edlinuxguru@gmail.com> wrote:

> On Wed, Jul 29, 2009 at 12:58 AM, Mark Kerzner <markkerzner@gmail.com>
> wrote:
> > Hi,
> > I set the number of reducers to 1, and I indeed get only one output
> > file, /output/part-00000.
> >
> > However, in configure() and in close() I do a System.out, and I see that
> > these are called three times, not once.
> >
> > Why does it matter to me? In configure() I open a zip file, into which I
> > write the binary parts of my maps, and in close() I close it. I would
> > expect this to be called just once, producing one zip file, but instead
> > it is called three times (and twice when running from the IDE), so it
> > produces three zip files. I have to play games so that the names of the
> > zip files don't collide, and I am not sure if this is stable.
> >
> > What am I missing in my understanding?
> >
> > Thank you,
> > Mark
> >
>
> You should take a look at all of the speculative execution properties, for
> example:
>   <property>
>     <name>mapred.reduce.tasks.speculative.execution</name>
>     <value>false</value>
>   </property>
>
> These cause multiple copies of the same map/reduce task to be executed to
> deal with slow tasks. In applications with side effects, like web/ftp
> fetching or file/database writing, you probably want these off.
>
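For completeness, the same switches can also be flipped programmatically from
the driver; a minimal sketch, assuming the old org.apache.hadoop.mapred.JobConf
API (the driver class name is a placeholder):

    import org.apache.hadoop.mapred.JobConf;

    // Programmatic equivalent of the XML property quoted above.
    JobConf conf = new JobConf(ZipJobDriver.class);  // placeholder class
    conf.setMapSpeculativeExecution(false);
    conf.setReduceSpeculativeExecution(false);
    // Or disable both map- and reduce-side speculation in one call:
    conf.setSpeculativeExecution(false);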
