hadoop-common-user mailing list archives

From maha <m...@umail.ucsb.edu>
Subject Quick Question: LineSplit or BlockSplit
Date Mon, 07 Feb 2011 23:38:12 GMT
Hi,

  I would appreciate your thoughts on whether efficiency is affected when:

  1) each mapper processes a single line of a document

  or

  2) each mapper processes a block of lines of a document.


 The obvious difference I can see is that (1) launches more mappers. Does that mean (1)
will be slower because of the added scheduling time?
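
 To make the comparison concrete, here is a rough plain-Java sketch of how the number
of map tasks differs between the two cases. It does not use any Hadoop API; the class name,
the method name, and the document and block sizes are all hypothetical, chosen only for
illustration.

```java
// Plain-Java sketch, not a Hadoop API. All names and sizes here are hypothetical.
final class SplitCount {

    // Number of map tasks needed if each mapper receives `linesPerSplit` lines.
    static long numSplits(long totalLines, long linesPerSplit) {
        if (totalLines < 0 || linesPerSplit <= 0) {
            throw new IllegalArgumentException(
                "totalLines must be >= 0 and linesPerSplit must be > 0");
        }
        // Ceiling division: a partial final block still needs its own mapper.
        return (totalLines + linesPerSplit - 1) / linesPerSplit;
    }

    public static void main(String[] args) {
        long totalLines = 1_000_000L; // assumed document size

        // Case (1): one mapper per line.
        System.out.println("per line:  " + numSplits(totalLines, 1));

        // Case (2): one mapper per block of 10,000 lines (assumed block size).
        System.out.println("per block: " + numSplits(totalLines, 10_000));
    }
}
```

 Under these assumed sizes, case (1) needs 10,000 times as many map tasks as case (2),
so any fixed per-task cost would be paid that many more times.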

Thank you,
Maha
 