hadoop-common-user mailing list archives

From CubicDesign <cubicdes...@gmail.com>
Subject Re: Processing 10MB files in Hadoop
Date Thu, 26 Nov 2009 15:39:10 GMT

> The number of mappers is determined by your InputFormat.
> In the common case, if a file is smaller than one block (which is 64M by
> default), one mapper handles the whole file. If the file is larger than one
> block, Hadoop will split it, and the number of mappers for the file
> will be ceiling( (size of file) / (size of block) )

Do you mean I should set the number of map tasks to 1?
I want to process this file not on a single node but across the entire
cluster. I need a lot of processing power in order to finish the job in
hours instead of days.
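For reference, the split arithmetic quoted above can be sketched as a plain integer ceiling; this is just an illustration of the formula, not Hadoop's actual split code (the class and method names here are made up for the example):

```java
public class SplitCount {
    // Mappers Hadoop assigns to one file under the default FileInputFormat:
    // ceiling(fileSize / blockSize), computed here with integer arithmetic.
    static long mappersForFile(long fileSizeBytes, long blockSizeBytes) {
        return (fileSizeBytes + blockSizeBytes - 1) / blockSizeBytes;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // A 10 MB file against the default 64 MB block: a single mapper.
        System.out.println(mappersForFile(10 * mb, 64 * mb));  // 1
        // A 200 MB file: ceiling(200 / 64) = 4 mappers.
        System.out.println(mappersForFile(200 * mb, 64 * mb)); // 4
    }
}
```

By this arithmetic a 10 MB file always lands in one split, which is why it runs on one node. If the goal is to spread a small file over the cluster, the usual approaches in this era of Hadoop are to lower the maximum split size (the `mapred.max.split.size` property consulted by FileInputFormat) or to use an InputFormat such as `NLineInputFormat` that creates one split per N input lines.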
