hadoop-common-user mailing list archives

From Kai Voigt...@123.org>
Subject Re: a question on NameNode
Date Mon, 19 Nov 2012 15:19:21 GMT

On 19.11.2012 at 16:14, "Kartashov, Andy" <Andy.Kartashov@mpac.ca> wrote:

> Does MapReduce run tasks on redundant blocks?
> Say you have only 1 block of data replicated 3 times, one copy on each of three DataNodes:
> block 1 – DN1 / block 1 (replica #1) – DN2 / block 1 (replica #2) – DN3
> Will MR attempt:
> a. to start 3 Map tasks (one per replicated block) and execute them all
> b. to start 3 Map tasks (one per replicated block) and drop the other two as soon as one of the three has executed successfully
> c. to start only 1 Map task (for just one block, avoiding all replicated ones) and attempt to start another (on one of the replicated blocks) when and only when the initially running task (say on DN1) has failed

The JobTracker will initially schedule the map task on one node only. There's no need to launch
the task on every node that has a local copy of the block.

If a task fails during its execution (node failure, e.g.), the JobTracker will launch the
task again on another node with that block.
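
To make the retry behaviour concrete, here is a minimal toy model in Python. It is only a sketch: `run_map_task`, `attempt_succeeds` and `max_attempts` are hypothetical names for illustration, not Hadoop APIs, and the cap of 4 attempts is an assumption chosen for the example.

```python
# Toy model only: none of these names are Hadoop APIs.
def run_map_task(block_id, replica_nodes, attempt_succeeds, max_attempts=4):
    """Run one map task for a block, retrying on another replica node on failure.

    replica_nodes: nodes holding a copy of the block, e.g. ["DN1", "DN2", "DN3"].
    attempt_succeeds: callable(node) -> bool, a stand-in for actually running the task.
    Returns the nodes tried, in order; the last one is where the task succeeded.
    Raises RuntimeError if every allowed attempt fails.
    """
    tried = []
    # Only one attempt runs at a time; the next replica is used only after a failure.
    for node in replica_nodes[:max_attempts]:
        tried.append(node)
        if attempt_succeeds(node):
            return tried
    raise RuntimeError(f"task for block {block_id} failed on {tried}")


# One block replicated on three DataNodes; the attempt on DN1 fails.
attempts = run_map_task("block1", ["DN1", "DN2", "DN3"], lambda node: node != "DN1")
print(attempts)  # ['DN1', 'DN2']
```

DN3 is never touched here: a single successful attempt per block is enough.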

There's another advanced feature called Speculative Execution. If a task is progressing slowly
through a phase (maybe due to flaky hardware), the JobTracker will launch a second attempt of the
same task in parallel on another node. The output of whichever attempt finishes first will be used,
and the slower attempt will be killed.
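
The race between the two attempts can be sketched as a small Python toy model. Again, `speculative_run` and `backup_start` are hypothetical names, and the fixed delay before the backup attempt starts is an assumption; the real scheduler decides based on task progress.

```python
# Toy model only: none of these names are Hadoop APIs.
def speculative_run(original, backup, backup_start):
    """Simulate one task with an optional speculative backup attempt.

    original, backup: (node, seconds_needed) tuples.
    backup_start: seconds after the original starts at which the backup is launched
                  (assumed fixed here for simplicity).
    Returns (winner_node, killed_node); killed_node is None if no backup was launched.
    """
    orig_node, orig_secs = original
    backup_node, backup_secs = backup
    if orig_secs <= backup_start:
        # The original finished before anyone considered it slow.
        return orig_node, None
    backup_finish = backup_start + backup_secs
    if backup_finish < orig_secs:
        return backup_node, orig_node  # backup wins, slow original is killed
    return orig_node, backup_node      # original wins, backup is killed


# DN1 is slow, so a second attempt starts on DN2 after 60 seconds.
winner, killed = speculative_run(("DN1", 300), ("DN2", 40), backup_start=60)
print(winner, killed)  # DN2 DN1
```

In Hadoop 1.x this feature can be switched on or off per job with the properties mapred.map.tasks.speculative.execution and mapred.reduce.tasks.speculative.execution; check the mapred-default.xml of your version for the defaults.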


Kai Voigt
