hadoop-mapreduce-user mailing list archives

From Vinod KV <vino...@yahoo-inc.com>
Subject Re: code for finding where the map outputs are transferred to file.
Date Wed, 18 Aug 2010 03:49:13 GMT

Moving mapreduce specific question to mapreduce-user@hadoop.apache.org

All map task execution starts in org.apache.hadoop.mapred.MapTask.

For your specific question, follow MapTask.runNewMapper() ->
NewOutputCollector -> MapOutputBuffer.
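
To make the spill behaviour you describe concrete before reading
MapOutputBuffer, here is a rough toy model of it. This is not Hadoop code:
the class SpillSketch and its methods are hypothetical names made up for
illustration, it swaps whole buffers instead of using the circular byte
buffer the real class manages, and it keeps "spills" in memory instead of
writing files. It only shows the idea of a collector handing a full buffer
to a background thread once a threshold is crossed.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Toy model only (hypothetical class, not part of Hadoop).
class SpillSketch {
    private final int threshold;                  // size at which a spill starts
    private List<String> buffer = new ArrayList<>();
    private int buffered = 0;                     // size of records in the current buffer
    // One background thread, so spills complete in the order they were started.
    private final ExecutorService spiller = Executors.newSingleThreadExecutor();
    // Stand-in for spill files on local disk.
    private final List<List<String>> spills = new CopyOnWriteArrayList<>();

    SpillSketch(int capacity, double spillPercent) {
        this.threshold = (int) (capacity * spillPercent);
    }

    // Called for every map output record (single collector thread assumed).
    void collect(String record) {
        buffer.add(record);
        buffered += record.length();              // character count as a stand-in for bytes
        if (buffered >= threshold) {
            startSpill();
        }
    }

    // Hand the full buffer to the background thread and keep collecting
    // into a fresh one.
    private void startSpill() {
        final List<String> full = buffer;
        buffer = new ArrayList<>();
        buffered = 0;
        spiller.submit(() -> {
            List<String> sorted = new ArrayList<>(full);
            Collections.sort(sorted);             // each spill is sorted before it is stored
            spills.add(sorted);
        });
    }

    // Spill whatever is left and wait for the background thread to finish.
    List<List<String>> flush() {
        if (!buffer.isEmpty()) {
            startSpill();
        }
        spiller.shutdown();
        try {
            spiller.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for spills", e);
        }
        return spills;
    }
}
```

In the real code the buffer size and threshold come from job configuration
(io.sort.mb and io.sort.spill.percent, if I remember the property names of
this release line correctly), so check MapOutputBuffer itself for the
actual details.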


On Tuesday 17 August 2010 04:17 PM, Rahul.V. wrote:
> Hi,
> I've read that the intermediate map output is written to disk at
> regular intervals. In fact, I've read that there are background threads which
> spill the data to disk whenever it crosses a threshold. [Source: Hadoop:
> The Definitive Guide.]
> I've tried to dig into the code a couple of times to see where exactly this
> is happening. If any of you know where it is, could you kindly let me know the
> filename and package name where I can find it?
