hadoop-common-dev mailing list archives

From Harsh J <qwertyman...@gmail.com>
Subject Re: mapper and reducer scheduling
Date Mon, 01 Nov 2010 03:36:12 GMT

On Mon, Nov 1, 2010 at 8:19 AM, Zhenhua Guo <jenvor@gmail.com> wrote:
> Thanks!
> One more question. Is the input file replicated on each node where a
> mapper is run? Or just the portion processed by a mapper is
> transferred?

With HDFS, the whole input file is not copied to every node. Where
possible, each mapper is run on a node that already holds a replica of
the input block it will process [data-local map tasks]. If no
TaskTracker slot is free on such a node, the mapper is run somewhere
else and only its input block ("portion processed by a mapper") is
fetched over the network, preferably from a DataNode in the same rack
[rack-local map tasks].
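
To make the preference order concrete, here is a rough sketch in Python.
It is not Hadoop's scheduler code: the function name place_map_task, the
Locality enum, and the dictionaries for free slots and rack membership
are all hypothetical, and the real JobTracker assigns tasks in response
to TaskTracker heartbeats rather than by scanning nodes like this. The
off-rack fallback in the last step is my addition for completeness.

```python
from enum import Enum
from typing import Dict, Optional, Set, Tuple


class Locality(Enum):
    DATA_LOCAL = "data-local"
    RACK_LOCAL = "rack-local"
    OFF_RACK = "off-rack"


def place_map_task(
    replica_nodes: Set[str],
    free_slots: Dict[str, int],
    rack_of: Dict[str, str],
) -> Optional[Tuple[str, Locality]]:
    """Pick a node for one map task, preferring nodes near its input block.

    replica_nodes: nodes that hold a replica of the task's input block.
    free_slots:    free map slots per node (hypothetical bookkeeping).
    rack_of:       node name -> rack name.
    """
    # Only nodes with a free slot can run the task; sort for determinism.
    candidates = sorted(n for n, slots in free_slots.items() if slots > 0)
    if not candidates:
        return None  # nothing free right now, the task has to wait

    # 1. Data-local: the block is already on the node, nothing is fetched.
    for node in candidates:
        if node in replica_nodes:
            return node, Locality.DATA_LOCAL

    # 2. Rack-local: the block is fetched from a DataNode in the same rack.
    replica_racks = {rack_of[n] for n in replica_nodes if n in rack_of}
    for node in candidates:
        if rack_of.get(node) in replica_racks:
            return node, Locality.RACK_LOCAL

    # 3. Otherwise any free node; the block then crosses racks.
    return candidates[0], Locality.OFF_RACK
```

For example, with node1 and node2 in rackA, node3 in rackB, and the
block stored only on node1, the task goes to node1 while it has a free
slot, to node2 once node1 is full, and to node3 only when both rackA
nodes are busy. In every case at most one block moves, never the whole
file.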

Harsh J
