hadoop-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: Regarding MapReduce Input Format
Date Wed, 07 Nov 2012 16:38:01 GMT
You are correct. (D) automatically does (B): once the input format reports the file as unsplittable, the whole file becomes a single split, so one map task's record reader ends up iterating over every key-value pair in it.
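
For reference, a minimal sketch of (D) against the new
org.apache.hadoop.mapreduce API. Note that Hadoop actually spells the
method isSplitable, with a single 't'; the class name here is just
illustrative:

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

// Reuses TextInputFormat's record reading; only the splitting
// behaviour changes.
public class WholeFileTextInputFormat extends TextInputFormat {
  @Override
  protected boolean isSplitable(JobContext context, Path file) {
    // Returning false makes the framework hand the entire file,
    // however many blocks it spans, to one map task as one split.
    return false;
  }
}

Wire it in with job.setInputFormatClass(WholeFileTextInputFormat.class);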

On Wed, Nov 7, 2012 at 9:41 PM, Ramasubramanian Narayanan
<ramasubramanian.narayanan@gmail.com> wrote:
> Hi,
>
> I came across the question below and I feel 'D' is the correct answer, but
> on some sites it is mentioned that 'B' is the correct answer... Could you
> please tell me which one is right, with an explanation?
>
> In a MapReduce job, you want each of your input files processed by a single
> map task. How do you configure a MapReduce job so that a single map task
> processes each input file, regardless of how many blocks the input file
> occupies?
> A. Increase the parameter that controls the minimum split size in the job
> configuration.
> B. Write a custom MapRunner that iterates over all key-value pairs in the
> entire file.
> C. Set the number of mappers equal to the number of input files you want to
> process.
> D. Write a custom FileInputFormat and override the method isSplittable to
> always return false.
>
> regards,
> Rams



-- 
Harsh J
