hadoop-common-user mailing list archives

From Prashant Kommireddi <prash1...@gmail.com>
Subject Re: Doubt from the book "Definitive Guide"
Date Thu, 05 Apr 2012 05:02:52 GMT
Hi Mohit,

What would be the advantage? Reducers in most cases read data from all
the mappers. Even if the mappers wrote their output to HDFS, a reducer
would still need to read data from other datanodes across the cluster.
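
To make the all-to-all nature of the shuffle concrete: with the default
HashPartitioner each key is routed to exactly one of the R reduce
partitions, so every map task produces a slice for every reducer and
each reducer must pull from every map, wherever that output lives. A
minimal sketch (the class name and key/value types here are just example
choices; the modulo logic mirrors Hadoop's default HashPartitioner):

  import org.apache.hadoop.io.IntWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapreduce.Partitioner;

  // Each key is assigned to one of numReduceTasks partitions, so every
  // mapper's output is split R ways and every reducer reads from M maps.
  public class ExampleHashPartitioner extends Partitioner<Text, IntWritable> {
    @Override
    public int getPartition(Text key, IntWritable value, int numReduceTasks) {
      return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }
  }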


On Apr 4, 2012, at 9:55 PM, Mohit Anchlia <mohitanchlia@gmail.com> wrote:

> On Wed, Apr 4, 2012 at 8:42 PM, Harsh J <harsh@cloudera.com> wrote:
>> Hi Mohit,
>> On Thu, Apr 5, 2012 at 5:26 AM, Mohit Anchlia <mohitanchlia@gmail.com>
>> wrote:
>>> I am going through the chapter "How mapreduce works" and have some
>>> confusion:
>>> 1) The description of the Mapper below says that reducers get the output
>>> file using an HTTP call. But the description under "The Reduce Side"
>>> doesn't specifically say if it's copied using HTTP. So first confusion:
>>> is the output copied from mapper -> reducer or from reducer -> mapper?
>>> And second, is the call http:// or hdfs://?
>> The flow is as simple as this:
>> 1. For an M+R job, a map task completes after writing all of its
>> partitions down to the tasktracker's local filesystem (under the
>> mapred.local.dir directories).
>> 2. Reducers fetch map-completion locations from events at the JobTracker,
>> and then ask the TaskTracker on that node for the specific partition they
>> need, which is served over the TaskTracker's HTTP service (port 50060).
>> So to clear things up - map doesn't send it to reduce, nor does reduce
>> ask the actual map task. It is the task tracker itself that makes the
>> bridge here.
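>>
>> As an illustration only (this is a sketch, not the actual Hadoop copier
>> code; the /mapOutput servlet path and its query parameters are as found
>> in the 1.x TaskTracker and may differ between releases), a reducer's
>> fetch boils down to an HTTP GET against the map-side TaskTracker:
>>
>>   import java.io.InputStream;
>>   import java.net.URL;
>>
>>   // Hypothetical sketch of one reduce-side fetch. All identifiers below
>>   // are placeholder values for the example.
>>   public class ShuffleFetchSketch {
>>     public static void main(String[] args) throws Exception {
>>       String ttHost = "tt-node-01";   // TaskTracker that ran the map task
>>       String jobId = "job_201204050000_0001";
>>       String mapAttemptId = "attempt_201204050000_0001_m_000003_0";
>>       int partition = 2;              // this reducer's partition number
>>       URL mapOutput = new URL("http://" + ttHost + ":50060/mapOutput"
>>           + "?job=" + jobId + "&map=" + mapAttemptId + "&reduce=" + partition);
>>       try (InputStream in = mapOutput.openStream()) {
>>         // 'in' streams the sorted map output for this reducer's partition
>>       }
>>     }
>>   }
>>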
>> Note, however, that in Hadoop 2.0 the transfer via the ShuffleHandler
>> would be over Netty connections, which should be much faster and more
>> reliable.
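>>
>> For reference (a sketch of the relevant yarn-site.xml properties; the
>> auxiliary service name has varied across early 0.23/2.x releases, so
>> check your release's documentation), the Netty-based shuffle is enabled
>> as a NodeManager auxiliary service roughly like this:
>>
>>   yarn.nodemanager.aux-services = mapreduce_shuffle
>>   yarn.nodemanager.aux-services.mapreduce_shuffle.class =
>>       org.apache.hadoop.mapred.ShuffleHandler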
>>> 2) My understanding was that mapper output gets written to HDFS, since
>>> I've seen part-m-00000 files in HDFS. If mapper output is written to
>>> HDFS, then shouldn't reducers simply read it from HDFS instead of making
>>> HTTP calls to the tasktracker's location?
>> A map-only job usually writes out to HDFS directly (no sorting is done,
>> since no reducer is involved). If the job is a map+reduce one, the map
>> output is by default collected to the local filesystem for partitioning
>> and sorting at the map end, and eventually grouping at the reduce end.
>> Basically: data you want to send from the mapper to the reducer goes to
>> the local FS so that multiple actions can be performed on it; other data
>> may go directly to HDFS.
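>>
>> As a small illustration (a fragment, assuming an existing Configuration
>> named conf and a hypothetical MyMapper class), a job submitted with zero
>> reduce tasks writes its map output straight to HDFS as part-m-NNNNN
>> files, whereas the same job with reducers would spill its intermediate
>> map output to mapred.local.dir instead:
>>
>>   import org.apache.hadoop.fs.Path;
>>   import org.apache.hadoop.mapreduce.Job;
>>   import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
>>
>>   Job job = new Job(conf, "map-only example");
>>   job.setMapperClass(MyMapper.class);       // hypothetical mapper class
>>   job.setNumReduceTasks(0);                 // map-only: part-m-* lands in HDFS
>>   FileOutputFormat.setOutputPath(job, new Path("/user/example/out"));
>>   job.waitForCompletion(true);
>>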
>> Reducers are currently scheduled pretty randomly, but yes, their
>> scheduling can be improved for certain scenarios. However, if you are
>> suggesting that map partitions ought to be written to HDFS itself (with
>> replication or without), I don't see performance improving. Note that
>> the partitions aren't merely written but need to be sorted as well (at
>> either end). Doing that requires the ability to spill frequently (since
>> we don't have infinite memory to do it all in RAM), and doing such a
>> thing on HDFS would only mean a slowdown.
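>>
>> For context (Hadoop 1.x property names, shown only to illustrate why
>> frequent local spills matter), the map-side sort buffer is bounded and
>> spills sorted runs to the local disks whenever it fills past a threshold:
>>
>>   // Sketch only: knobs for the in-memory map-output buffer that spills
>>   // to mapred.local.dir when it fills (conf is a JobConf/Configuration).
>>   conf.setInt("io.sort.mb", 100);                 // sort buffer size in MB
>>   conf.setFloat("io.sort.spill.percent", 0.80f);  // spill when ~80% full
>>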
> Thanks for clearing my doubts. In this case I was merely suggesting that
> if the mapper output (the final merged output, or the shuffle output) were
> stored in HDFS, then reducers could just retrieve it from HDFS instead of
> asking the tasktracker for it. Once the reducer threads read it, they can
> continue to work locally.
>> I hope this helps clear some things up for you.
>> --
>> Harsh J
