hadoop-mapreduce-user mailing list archives

From Radim Kolar <...@sendmail.cz>
Subject increasing number of mappers.
Date Wed, 09 Nov 2011 11:11:33 GMT
I have two input sequence files, 32 MB each. I want to run them on as many
mappers as possible.

I appended -D mapred.max.split.size=1000000 as a command-line argument to
the job, but it makes no difference. The job still runs on 2 mappers.

How does split size work? Is max split size applied when reading files or when writing them?
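For reference, Hadoop's FileInputFormat derives the split size from three values: splitSize = max(minSplitSize, min(maxSplitSize, blockSize)). The sketch below is a standalone reproduction of that arithmetic (not the actual Hadoop code, which lives in org.apache.hadoop.mapred.FileInputFormat); the 64 MB block size and min split size of 1 are assumed defaults. If the -D setting had taken effect, each 32 MB file would be cut into roughly 34 splits, so seeing only 2 mappers suggests the property was never picked up by the job.

```java
// Standalone sketch of FileInputFormat's split-size calculation
// (hypothetical reproduction; real logic is in org.apache.hadoop.mapred.FileInputFormat).
public class SplitSizeDemo {

    // splitSize = max(minSize, min(maxSize, blockSize))
    static long computeSplitSize(long minSize, long maxSize, long blockSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    public static void main(String[] args) {
        long fileSize  = 32L * 1024 * 1024;  // one 32 MB sequence file
        long blockSize = 64L * 1024 * 1024;  // assumed default HDFS block size
        long minSize   = 1L;                 // assumed mapred.min.split.size default
        long maxSize   = 1_000_000L;         // -D mapred.max.split.size=1000000

        long splitSize = computeSplitSize(minSize, maxSize, blockSize);
        long numSplits = (long) Math.ceil((double) fileSize / splitSize);

        // With these inputs: split size = 1000000, splits per file = 34
        System.out.println("split size = " + splitSize
                + ", splits per file = " + numSplits);
    }
}
```

Note that -D options are only honored if the job's driver class runs through ToolRunner (i.e., uses GenericOptionsParser); a main() that builds its own JobConf without it will silently ignore them.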

Does it work like this: set the max split size when writing, and you get a
bunch of sequence files as output; then you get the same number of mappers
as input files?
