hadoop-mapreduce-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: reducer behavior
Date Sat, 21 Jan 2012 19:54:42 GMT
The only difference would be that with 4 reducers your keys get partitioned based on
their hashCode() implementation (if you use the default hash partitioner), and each
partition is sent to one reducer. If you're using a custom key implementation, I'd check
its hashCode() first.
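To make the partitioning point concrete, here is a small sketch (not from the original email; `PartitionSketch` and `MyKey` are hypothetical names) of how Hadoop's default hash partitioner routes a key to one of N reducers, and why a custom key type must keep hashCode() consistent with equals():

```java
// Sketch: routing keys to reducers the way the default HashPartitioner does.
// If equal keys could return different hash codes, copies of the "same" key
// would scatter across reducers and records would appear to go missing.
public class PartitionSketch {

    // Hypothetical custom key: wraps an id, with hashCode/equals kept in sync.
    static final class MyKey {
        final int id;
        MyKey(int id) { this.id = id; }
        @Override public boolean equals(Object o) {
            return o instanceof MyKey && ((MyKey) o).id == id;
        }
        @Override public int hashCode() { return id; }
    }

    // Mirrors Hadoop's default HashPartitioner logic:
    // (hashCode & Integer.MAX_VALUE) % numReduceTasks
    static int getPartition(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        // Equal keys must always land on the same reducer.
        System.out.println(getPartition(new MyKey(42), 4)); // prints 2
        System.out.println(getPartition(new MyKey(42), 4)); // prints 2 again
    }
}
```

A key class that derives hashCode() from object identity (the `Object` default) would break this invariant, which is the first thing to rule out with a custom key.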

Check the input record counters on your reducers against the total map output record
counter - the reducer input counts should sum to the map output total. Also make sure
you aren't skipping over the reducer's value iterator under any condition in your
reduce operation.
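On the iterator point: Hadoop hands reduce() a single pass over the values for each key, so every output record must be collected inside the loop that drains it. A plain-Java sketch (hypothetical `DrainIterator` class, not the poster's code) of the correct shape:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Sketch: a reduce-style function that emits one output per input value.
// An early return or break inside the loop would silently drop records
// for that key - the symptom being fewer output records than expected.
public class DrainIterator {

    static List<String> reduceAll(String key, Iterator<Integer> values) {
        List<String> out = new ArrayList<>();
        while (values.hasNext()) {          // drain every value
            out.add(key + ":" + values.next());
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(1, 2, 3);
        System.out.println(reduceAll("k", values.iterator()));
        // prints [k:1, k:2, k:3]
    }
}
```

In a real Hadoop reducer the same rule applies to the `Iterator<V>` passed to reduce(): conditional logic that exits before hasNext() is false loses the remaining values.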

I'm guessing it's mostly your logic that's somehow causing this, but I don't have your
source to say that for sure.

On 21-Jan-2012, at 11:47 PM, Thamizhannal Paramasivam wrote:

> Hi All,
> I am experimenting with a MapReduce program on Hadoop-0.19. This program has a single
> input file with 7 records (later it may have many records across multiple files), and
> each input record is supposed to produce 11 output records. When it runs with
> no_of_reducer=4, it produces only 33 records. But when I ran it with no_of_reducer=1,
> it produced 77 records as expected.
> 
> What could be the reason for this? Am I missing any configuration parameter?
> 
> Thanks
> Tamil
> 

--
Harsh J
Customer Ops. Engineer, Cloudera

