hadoop-mapreduce-user mailing list archives

From Kim Ebert <...@reflectivedevelopment.com>
Subject Re: increasing number of mappers.
Date Wed, 09 Nov 2011 18:46:28 GMT
I found the following works for me.

FileInputFormat.setMaxInputSplitSize(job, 10L * 1024L);
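
For context, here is a minimal driver sketch showing where that call fits. It is
untested; the class name SmallSplitDriver is a placeholder, the mapper/reducer and
output key/value classes are omitted, and it assumes the new
org.apache.hadoop.mapreduce API:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class SmallSplitDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = new Job(conf, "small-split job");
        job.setJarByClass(SmallSplitDriver.class);

        // Input is sequence files; cap each input split at 10 KB so a
        // 32 MB file is broken into many splits, and hence many mappers.
        job.setInputFormatClass(SequenceFileInputFormat.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileInputFormat.setMaxInputSplitSize(job, 10L * 1024L);

        // Mapper, reducer, and output key/value classes left out for brevity;
        // set them to match the types stored in your seq files.
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

Note that the max split size only affects how existing input files are divided into
splits when they are read; it does not change how the seq files were written.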

Kim

On 11/09/2011 04:11 AM, Radim Kolar wrote:
> I have 2 input seq files 32MB each. I want to run them on as many 
> mappers as possible.
>
> I appended -D mapred.max.split.size=1000000 as a command-line argument 
> to the job, but there is no difference. The job still runs on 2 mappers.
>
> How does split size work? Is the max split size used for reading or 
> writing files?
>
> Does it work like this: set maxsplitsize, write the files, and you get a 
> bunch of seq files as output, then you get the same number of mappers as 
> input files?
>

