hadoop-common-dev mailing list archives

From <ext-fabio.alme...@nokia.com>
Subject RE: Partitioners - How to know if they are working
Date Fri, 17 Feb 2012 15:24:21 GMT
Hello David,

I am following your tip! Thanks.

Also, I configured a small cluster with three datanodes, and in my MR program I printed every
single key that the reducers received. I set three reducers (setNumReduceTasks).

Analyzing the reducer outputs I could see that the keys were distributed as my partitioner

Of course, I had to make everything much smaller than the real setup: I prepared an input, built a
small cluster, and so on, to keep at least minimal control over the test.

Not that I doubt Hadoop; I doubt my code, always! :-)

Fabio Almeida 

-----Original Message-----
From: ext David Rosenstrauch [mailto:darose@darose.net] 
Sent: Friday, February 17, 2012 12:16 AM
To: mapreduce-user@hadoop.apache.org
Subject: Re: Partitioners - How to know if they are working

On 02/16/2012 12:49 PM, ext-fabio.almeida@nokia.com wrote:
> Hello All,
> I wrote my own partitioner and I would like to see if it's working.
> By printing the return of method getPartition I could see that the partitions were different,
> but were they really working? To answer that I got the keys that every reducer task processed
> and that was what I expected. It seems my partitioner is working properly. But not easy to
> discover though.
> Does anyone know if there is an easier way to see if your customized partitioner is working?
> For instance, a counter that shows how many partitioners a map generated or a reducer received?
> Thanks in advance,
> Fabio Almeida

At my last job we wrote a custom partitioner, and we tested it out completely outside of Hadoop
using standard JUnit unit tests.
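
A minimal sketch of that kind of Hadoop-free check, written in plain Java so that it runs
without a cluster. The class PartitionerCheck, its first-letter routing rule, and the helper
countPerPartition are hypothetical names invented for illustration; they are not from the
thread. A real partitioner would implement Hadoop's getPartition(key, value, numPartitions),
and the same assertions could then be moved into JUnit tests.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a custom partitioner. It mirrors the shape of
// Hadoop's getPartition(key, value, numPartitions) but has no Hadoop
// dependency, so it can be exercised directly.
class PartitionerCheck {

    // Assumed rule, for illustration only: route keys by their first character.
    static int getPartition(String key, int numPartitions) {
        if (key.isEmpty()) {
            return 0;
        }
        int firstChar = Character.toLowerCase(key.charAt(0));
        // Mask the sign bit so the result can never be negative.
        return (firstChar & Integer.MAX_VALUE) % numPartitions;
    }

    // Counts how many keys land in each partition and rejects any
    // partition number outside [0, numPartitions).
    static Map<Integer, Integer> countPerPartition(String[] keys, int numPartitions) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (String key : keys) {
            int partition = getPartition(key, numPartitions);
            if (partition < 0 || partition >= numPartitions) {
                throw new IllegalStateException("partition out of range: " + partition);
            }
            counts.merge(partition, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] keys = {"apple", "avocado", "banana", "cherry"};
        // Prints the number of keys assigned to each of three partitions.
        System.out.println(countPerPartition(keys, 3));
    }
}
```

Checks like this cover the two things that were hard to see on the cluster: that every key maps
to a valid partition number, and that keys meant to travel together end up in the same one.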


