hadoop-mapreduce-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: splits and maps
Date Wed, 19 Sep 2012 16:24:57 GMT
Hi Tim,

Splits don't look at newlines, at least not in TextInputFormat; they
are computed purely from byte offsets. So, since the number of computed
splits is greater than the default number of maps, I think a file of
exactly 10 blocks will spawn only 10 mappers. It is each mapper's record
reader that reads on until a newline, even past the end of its split's
byte range.
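
To make that concrete, here is a rough Python simulation of the idea. It
is only a sketch, not the actual Hadoop code: compute_splits and
read_records are hypothetical names, the split computation ignores
details such as the slop factor on the last split, and I am assuming the
reader skips its first line whenever the split does not start at offset 0
and keeps reading lines as long as the current position is not past the
split end.

```python
def compute_splits(file_len, split_size):
    """Return (start, length) byte ranges. Newlines are never consulted."""
    return [(off, min(split_size, file_len - off))
            for off in range(0, file_len, split_size)]


def read_records(data, start, length):
    """Return the lines a reader for the split [start, start+length) emits."""
    end = start + length
    pos = start
    if start != 0:
        # Assumed behaviour: discard the first line found from `start`,
        # because the reader of the previous split emits it.
        nl = data.find(b"\n", pos)
        pos = len(data) if nl == -1 else nl + 1
    records = []
    # Keep reading whole lines while the line *starts* at or before `end`,
    # so the last line may run past the split's byte range.
    while pos <= end and pos < len(data):
        nl = data.find(b"\n", pos)
        nxt = len(data) if nl == -1 else nl + 1
        records.append(data[pos:nxt].rstrip(b"\n"))
        pos = nxt
    return records


# 640 units of data with a split size of 64 gives 10 splits,
# wherever the newlines happen to fall.
print(len(compute_splits(640, 64)))
```

With these rules every line is emitted by exactly one reader, even when a
split boundary falls in the middle of a line, and the number of splits
(and hence mappers) depends only on the file length and the split size.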

On Wed, Sep 19, 2012 at 9:16 PM, Tim Robertson
<timrobertson100@gmail.com> wrote:
> I think the splitting recognises the end of line, so you might get 11 but
> otherwise that looks correct.
>
>
>
> On Wed, Sep 19, 2012 at 5:42 PM, Pedro Sá da Costa <psdc1978@gmail.com>
> wrote:
>>
>>
>>
>> If I have an input file of 640 MB in size, and a split size of 64 MB, this
>> file will be partitioned into 10 splits, and each split will be processed by
>> a map task, right?
>>
>> --
>> Best regards,
>>
>



-- 
Harsh J
