hadoop-common-dev mailing list archives

From Farhan Husain <farhan.hus...@csebuet.org>
Subject Relation between number of map tasks and input splits
Date Thu, 23 Sep 2010 20:41:36 GMT
Hello,

Can a map task work on more than one input split? I am using hadoop-0.20.1
and in my map method I need to know the name of the file I am getting input
from. I use the following code to get that:

String inputFile = ((FileSplit) context.getInputSplit()).getPath().getName();

If a map task works on only one input split, I can put that code in the
setup() method, which would be more efficient when handling a large amount
of data. Otherwise, I have to put the code in the map() method, which would
slow me down because it would run for every input key/value pair. I have
gone through the following two pages but did not get a clear picture:

http://wiki.apache.org/hadoop/HadoopMapReduce
http://wiki.apache.org/hadoop/HowManyMapsAndReduces
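
To make the setup() variant concrete, here is a rough, self-contained sketch of what I have in mind. It assumes that one map task handles exactly one input split, which is the very thing I am asking about. Split, Context, FileNameMapper and runTask are hypothetical stand-ins written only for this illustration; they are not Hadoop classes. In a real job the lookup would be the FileSplit cast shown above, placed inside Mapper.setup().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SetupVsMapSketch {

    // Hypothetical stand-in for a file-based input split (NOT Hadoop's FileSplit).
    static final class Split {
        final String path;

        Split(String path) {
            this.path = path;
        }

        // Returns the last path component, i.e. the file name.
        String getName() {
            return path.substring(path.lastIndexOf('/') + 1);
        }
    }

    // Hypothetical stand-in for the task context (NOT Hadoop's Mapper.Context).
    static final class Context {
        private final Split split;
        int splitLookups = 0;                      // counts calls to getInputSplit()
        final List<String> output = new ArrayList<String>();

        Context(Split split) {
            this.split = split;
        }

        Split getInputSplit() {
            splitLookups++;
            return split;
        }

        void write(String key, String value) {
            output.add(key + "\t" + value);
        }
    }

    // Mapper that resolves the file name once in setup() and reuses it in map().
    static final class FileNameMapper {
        private String inputFile;

        void setup(Context context) {
            inputFile = context.getInputSplit().getName();
        }

        void map(String value, Context context) {
            context.write(inputFile, value);
        }
    }

    // Assumption: one task processes exactly one split, so setup() runs once
    // and map() runs once per record of that split.
    static Context runTask(String path, List<String> records) {
        Context context = new Context(new Split(path));
        FileNameMapper mapper = new FileNameMapper();
        mapper.setup(context);
        for (String record : records) {
            mapper.map(record, context);
        }
        return context;
    }

    public static void main(String[] args) {
        Context c = runTask("/data/logs/part-00000", Arrays.asList("a", "b", "c"));
        System.out.println(c.splitLookups); // prints 1: the split was looked up once
        System.out.println(c.output.size()); // prints 3: one output per record
    }
}
```

If a task could receive more than one split, the cached name would go stale and the lookup would have to move back into map().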

Thanks,
Farhan
