hadoop-mapreduce-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: WholeFileInputFormat format
Date Tue, 10 Jul 2012 13:25:20 GMT
It depends on what you need. If your file is not splittable, or if you
need to read the whole file from a single mapper (i.e. you do not
_want_ it to be split), then use WholeFileInputFormat. Otherwise, you
get more parallelism with regular splitting.
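
To put rough numbers on that trade-off, here is a minimal sketch in
plain Java. It is not Hadoop code: the class and method names are made
up for illustration, the 64 MB split size is only an assumption, and
the ceiling division is a simplification of how splits are really
computed.

```java
// Hypothetical helper, not part of Hadoop: estimates how many map tasks
// a single file produces under a simplified model of input splitting.
final class SplitEstimate {

    // Returns the estimated number of map tasks for one file.
    // splittable == false models a whole-file input format, where the
    // entire file always goes to exactly one mapper.
    static long mapTasks(long fileBytes, long splitBytes, boolean splittable) {
        if (fileBytes <= 0 || splitBytes <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        if (!splittable) {
            return 1;
        }
        // Ceiling division: a partial trailing chunk still needs a mapper.
        return fileBytes / splitBytes + (fileBytes % splitBytes == 0 ? 0 : 1);
    }

    public static void main(String[] args) {
        long oneGiB = 1L << 30;      // example file size
        long splitSize = 64L << 20;  // assumed split size (64 MB)

        System.out.println("regular splitting: "
                + mapTasks(oneGiB, splitSize, true) + " map tasks");
        System.out.println("whole file:        "
                + mapTasks(oneGiB, splitSize, false) + " map task");
    }
}
```

Under these assumptions a 1 GB file is spread over 16 mappers with
regular splitting, but is read end to end by a single mapper when it
is not split, so that one task's runtime grows with the file size.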

On Tue, Jul 10, 2012 at 6:31 PM, Mohammad Tariq <dontariq@gmail.com> wrote:
> Hello list,
>        What is the approximate maximum size of file that can be
> handled using WholeFileInputFormat? I mean, if the file is very
> big, is it still feasible to use WholeFileInputFormat, given that
> the entire load will go to one mapper? Many thanks.
> Regards,
>     Mohammad Tariq

Harsh J
