hadoop-common-user mailing list archives

From "Tarandeep Singh" <tarand...@gmail.com>
Subject Question about ChainMapper and ChainReducer
Date Tue, 25 Nov 2008 18:27:46 GMT

I would like to know how ChainMapper and ChainReducer save I/O.

The documentation says that the output of the first mapper becomes the input
of the second, and so on. Does this mean that the output of the first map is
*not* written to HDFS, and that a second map process is started that operates
only on the data generated by the first map?

In other words, is it safe to assume that if map1 ran on node1 and produced
output D1, then D1 is stored locally on node1 and a second map process (from
the chained map job) operates only on this local D1?
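
To make the question concrete, here is a rough plain-Java sketch of one way
such a chain could avoid intermediate I/O: each record produced by one map
stage is handed directly to the next stage in memory, inside a single task,
and only the last stage's output is collected. This is not Hadoop code and
says nothing certain about how ChainMapper is implemented; ChainSketch, emit,
runChain, demo and the two stage functions are all made-up names for
illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

// Hypothetical illustration only: plain Java, no Hadoop classes involved.
final class ChainSketch {

    // Pass one record through the stages starting at stageIndex.
    // Each stage's output is handed straight to the next stage in memory;
    // only the output of the last stage reaches the sink.
    static void emit(String record, int stageIndex,
                     List<Function<String, List<String>>> stages,
                     List<String> sink) {
        if (stageIndex == stages.size()) {
            sink.add(record);
            return;
        }
        for (String out : stages.get(stageIndex).apply(record)) {
            emit(out, stageIndex + 1, stages, sink);
        }
    }

    // Run every input record through the whole chain and collect the final output.
    static List<String> runChain(List<String> input,
                                 List<Function<String, List<String>>> stages) {
        List<String> sink = new ArrayList<>();
        for (String record : input) {
            emit(record, 0, stages, sink);
        }
        return sink;
    }

    // Two made-up map stages: split a line into words, then upper-case each word.
    static List<String> demo() {
        Function<String, List<String>> splitWords =
                line -> Arrays.asList(line.split(" "));
        Function<String, List<String>> upperCase =
                word -> Collections.singletonList(word.toUpperCase());
        return runChain(Arrays.asList("a b", "c"),
                        Arrays.asList(splitWords, upperCase));
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [A, B, C]
    }
}
```

In this model the intermediate data (D1 above) never exists as a file at all,
neither on HDFS nor on local disk, and no second process is involved.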

