hadoop-mapreduce-user mailing list archives

From Harsh J <qwertyman...@gmail.com>
Subject Re: Map tasks execution
Date Sat, 12 Feb 2011 18:01:06 GMT
Hello,

On Sat, Feb 12, 2011 at 9:34 PM, Pedro Costa <psdc1978@gmail.com> wrote:
> Hi,
>
> 1 - When a Map task is taking too long to finish its process, the JT
> launches another Map task to process. This means that the task that
> was replaced is killed?

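If a task times out, it is killed and rescheduled. If you're noticing
this in the final waves, it could be the speculative execution feature
of Hadoop MapReduce, which is enabled by default. In that case the
original attempt is not killed up front: both attempts keep running, and
the one that loses the race is killed once the other completes.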

> 2 - Does Hadoop MR allows that the same input split be processed by 2
> different mappers at the same time?

In some ways, yes.

There is a speculative execution feature that does this exact thing
(two tasks may be 'computing' in the same race - whichever reports a
completion first, wins). See the 'Speculative execution' sub-topic of
this YDN Hadoop modules page for some details:
http://developer.yahoo.com/hadoop/tutorial/module4.html#tolerence

It should also be possible to get this effect by supplying the same
input split / path more than once (although, again, there is no
guarantee that the duplicates run at the same time).
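
To make the "whichever reports a completion first, wins" race concrete,
here is a minimal sketch in plain Java. It is only an analogy and not
how Hadoop implements speculative execution: the class SpeculativeRace,
the helper attempt(), and the attempt ids are all hypothetical names,
and the JDK's ExecutorService.invokeAny stands in for the JobTracker's
bookkeeping.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class SpeculativeRace {

    // Hypothetical stand-in for one task attempt: sleeps to simulate the
    // work done on a split, then returns its own attempt id.
    public static Callable<String> attempt(String id, long workMillis) {
        return () -> {
            Thread.sleep(workMillis); // interrupted if another attempt wins first
            return id;
        };
    }

    // Runs every attempt concurrently and returns the id of the first one
    // to complete successfully. invokeAny cancels the attempts that lose.
    public static String race(List<Callable<String>> attempts) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(attempts.size());
        try {
            return pool.invokeAny(attempts);
        } finally {
            pool.shutdownNow(); // make sure no losing attempt lingers
        }
    }

    public static void main(String[] args) throws Exception {
        // Two attempts over the "same split": a straggler and its
        // speculative duplicate (the durations are made up).
        String winner = race(List.of(
                attempt("attempt_0 (straggler)", 2000),
                attempt("attempt_1 (speculative)", 100)));
        System.out.println("Winner: " + winner);
    }
}
```

As in the real feature, only one attempt's result is kept and the other
attempt is cancelled rather than allowed to finish.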

-- 
Harsh J
www.harshj.com
