hadoop-mapreduce-user mailing list archives

From bejoy.had...@gmail.com
Subject Re: num of reducer
Date Thu, 16 Feb 2012 16:21:37 GMT
Hi Tamizh
If your input consists of text files, then changing the input format to TextInputFormat
should get things right: you get one mapper for each HDFS block.
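
As a rough illustration of the "one mapper per HDFS block" rule, here is a minimal sketch that estimates the map task count for the input described in this thread (more than 1000 files of 100-200MB). The 64MB block size is an assumption and is not stated anywhere in the thread; the class name SplitEstimate and the helper splitsForFile are hypothetical names used only for this example. The real split calculation also honours minimum split size settings and a small slop factor, which this sketch ignores.

```java
// Rough estimate only: this is not Hadoop code and does not call any Hadoop API.
// It just applies "one map task per block" to each file independently.
public class SplitEstimate {
    static final long MB = 1024L * 1024L;

    // Approximate number of block-sized splits (and so map tasks) for one file.
    static long splitsForFile(long fileBytes, long blockBytes) {
        if (fileBytes <= 0) {
            return 0;
        }
        return (fileBytes + blockBytes - 1) / blockBytes; // ceiling division
    }

    public static void main(String[] args) {
        long blockBytes = 64 * MB; // ASSUMPTION: block size is not given in the thread
        int fileCount = 1000;      // the question mentions more than 1000 files

        // Lower and upper bounds, using the 100MB and 200MB file sizes from the question.
        long low = fileCount * splitsForFile(100 * MB, blockBytes);
        long high = fileCount * splitsForFile(200 * MB, blockBytes);

        System.out.println("Estimated map tasks: " + low + " to " + high);
    }
}
```

Under that assumed block size, each file contributes 2 to 4 splits, so the job would be expected to run a few thousand map tasks rather than 2.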


Regards
Bejoy K S

From handheld, Please excuse typos.

-----Original Message-----
From: Thamizhannal Paramasivam <thamizhannal.p@gmail.com>
Date: Thu, 16 Feb 2012 21:33:11 
To: <mapreduce-user@hadoop.apache.org>
Reply-To: mapreduce-user@hadoop.apache.org
Subject: Re: num of reducer

Here are the input format details for the mapper.
Input Format: MultiFileInputFormat
MapperOutputKey : Text
MapperOutputValue: CustomWritable

I am not in a position to upgrade from hadoop-0.19.2, for certain reasons.

I checked the number of mappers on the JobTracker.

Thanks,
Thamizh

On Thu, Feb 16, 2012 at 6:56 PM, Joey Echeverria <joey@cloudera.com> wrote:

> Hi Tamil,
>
> I'd recommend upgrading to a newer release as 0.19.2 is very old. As for
> your question, most input formats should set the number of mappers correctly.
> What input format are you using? Where did you see the number of tasks it
> assigned to the job?
>
> -Joey
>
>
> On Thu, Feb 16, 2012 at 1:40 AM, Thamizhannal Paramasivam <
> thamizhannal.p@gmail.com> wrote:
>
>> Hi All,
>> I am using hadoop-0.19.2 and running a mapper-only job on a cluster. Its
>> input path has >1000 files of 100-200MB each. Since it is a mapper-only job, I
>> set the number of reducers to 0. It is using 2 mappers to process all the
>> input files. If we do not state the number of mappers, wouldn't it pick 1
>> mapper per input file? Or doesn't the default pick a fair number of
>> mappers according to the number of input files?
>> Thanks,
>> tamil
>
>
>
>
> --
> Joseph Echeverria
> Cloudera, Inc.
> 443.305.9434
>
>
