nutch-dev mailing list archives

From "Anton Potehin" <an...@orbita1.ru>
Subject mapred.map.tasks
Date Thu, 20 Apr 2006 06:56:24 GMT
<property>
  <name>mapred.map.tasks</name>
  <value>2</value>
  <description>The default number of map tasks per job.  Typically set
  to a prime several times greater than number of available hosts.
  Ignored when mapred.job.tracker is "local".
  </description>
</property>

 

We have a question about this property. Is it really preferable to set
this parameter to several times the number of available hosts? We do
not understand why that should be the case.

Our spider is distributed across 3 machines. What value would be best
for this parameter in our case? Which other factors affect the best
value for this parameter?
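
For illustration only: read literally for a 3-machine cluster, the rule of thumb in the description would suggest an override along the lines of the sketch below. The value 11 is an assumed example (a prime a few times larger than 3), not a tested or recommended setting, and placing it in a site-specific override file rather than the defaults file is likewise an assumption.

```xml
<!-- Hypothetical override for a 3-host cluster; the value is an assumption. -->
<property>
  <name>mapred.map.tasks</name>
  <!-- 11 is a prime roughly 3-4 times the host count (3); illustrative only. -->
  <value>11</value>
</property>
```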

 

