hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Lucene-hadoop Wiki] Update of "HowManyMapsAndReduces" by LohitVijayarenu
Date Thu, 16 Aug 2007 23:37:03 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-hadoop Wiki" for change notification.

The following page has been changed by LohitVijayarenu:
http://wiki.apache.org/lucene-hadoop/HowManyMapsAndReduces

------------------------------------------------------------------------------
  
  == Number of Reduces ==
  
- The right number of reduces seems to be between 1.0 and 1.75 * (nodes * mapred.tasktracker.tasks.maximum).
At 1.0 all of the reduces can launch immediately and start transferring map outputs as the
maps finish. At 1.75 the faster nodes will finish their first round of reduces and launch
a second round of reduces, doing a much better job of load balancing.
+ The right number of reduces seems to be 0.95 or 1.75 * (nodes * mapred.tasktracker.tasks.maximum).
At 0.95 all of the reduces can launch immediately and start transferring map outputs as the
maps finish. At 1.75 the faster nodes will finish their first round of reduces and launch
a second round of reduces, doing a much better job of load balancing.
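
As an illustration of the formula above, here is a minimal sketch of a job driver that applies the 0.95 factor, assuming the old org.apache.hadoop.mapred API of that era; the class name ReduceCountExample and the fallback of 2 tasks per node are the editor's assumptions, not part of the wiki page.

    import org.apache.hadoop.mapred.ClusterStatus;
    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;

    // Hypothetical driver class, not from the wiki page.
    public class ReduceCountExample {
      public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(ReduceCountExample.class);
        ClusterStatus cluster = new JobClient(conf).getClusterStatus();

        // Number of task trackers (nodes) currently in the cluster.
        int nodes = cluster.getTaskTrackers();
        // Concurrent task slots per node; the fallback of 2 is assumed here.
        int tasksPerNode = conf.getInt("mapred.tasktracker.tasks.maximum", 2);

        // 0.95 lets all of the reduces launch immediately as the maps finish;
        // 1.75 would instead give the faster nodes a second, load-balancing wave.
        int reduces = (int) (0.95 * nodes * tasksPerNode);
        conf.setNumReduceTasks(reduces);
      }
    }

On a 10-node cluster with 2 tasks per tracker this requests (int)(0.95 * 20) = 19 reduces, so every reduce can begin copying map output at once.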
  
  Currently the number of reduces is limited to roughly 1000 by the buffer size for the output
files (io.buffer.size * 2 * numReduces << heapSize). This will be fixed at some point,
but until it is, it provides a pretty firm upper bound.
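
To make that bound concrete with assumed numbers (neither value appears in the page): with a hypothetical io.buffer.size of 64 KB and a 1 GB task heap, io.buffer.size * 2 * numReduces << heapSize requires 128 KB * numReduces to stay well below 1 GB, i.e. numReduces well under 8192, so a limit of roughly 1000 leaves comfortable headroom.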
  
