hadoop-common-dev mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: Partitioners - How to know if they are working
Date Thu, 16 Feb 2012 18:34:07 GMT
Hi Fabio,

There are test cases in the MapReduce project releases that test
setting a custom partitioner and ensuring it works as intended.

But if you still wish to verify this yourself, you can add a LOG
statement to your custom Partitioner class's initialization methods
to indicate that it is being initialized, so that you can see it in
each map task's user logs.
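
A minimal sketch of that logging idea, written in plain Java so it
compiles without Hadoop on the classpath. SketchPartitioner is a
hypothetical stand-in that only mirrors the shape of Hadoop's
Partitioner.getPartition(key, value, numPartitions); in a real job the
class would extend Hadoop's Partitioner and log through whatever
logging framework your Hadoop release uses. The class names and the
hash-based rule are assumptions for illustration.

```java
import java.util.logging.Logger;

// Hypothetical stand-in for Hadoop's Partitioner contract (not the real class).
abstract class SketchPartitioner<K, V> {
    abstract int getPartition(K key, V value, int numPartitions);
}

// Illustrative partitioner that announces itself when it is constructed.
class LoggingHashPartitioner extends SketchPartitioner<String, String> {
    private static final Logger LOG =
            Logger.getLogger(LoggingHashPartitioner.class.getName());

    LoggingHashPartitioner() {
        // In a real job this line would show up in each map task's user logs.
        LOG.info("LoggingHashPartitioner initialized");
    }

    @Override
    int getPartition(String key, String value, int numPartitions) {
        // Mask the sign bit so the result is always in [0, numPartitions).
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }
}
```

If the log line never appears in a map task's logs, the job is
probably not using the class you think it is.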

There are other ways as well, but essentially there is no "fallback"
partitioner if a user-specified partitioner cannot be initialized
- tasks will fail if you've misconfigured the partitioner.

For counters - there are no per-partition counters on the map side
(there could be too many, depending on the number of reducers your
job has), but each reduce task has input record counters that you can
use to get the number of records that came into a specific partition.
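
Before looking at the reduce-side counters, you can get a rough idea
of the expected distribution by running your partitioning rule over
some sample keys locally. The sketch below is plain Java, not the
Hadoop API: PartitionTally, its first-character rule, and the sample
keys are all hypothetical. It tallies how many records each partition
would receive, which is the number you would then compare against
each reduce task's input record counter.

```java
import java.util.Arrays;
import java.util.List;

class PartitionTally {
    // Hypothetical partitioning rule: route by the first character of the key.
    static int getPartition(String key, int numPartitions) {
        if (key.isEmpty()) {
            return 0; // send empty keys to the first partition
        }
        return key.charAt(0) % numPartitions;
    }

    // Count how many of the given keys would land in each partition.
    static long[] countPerPartition(List<String> keys, int numPartitions) {
        long[] counts = new long[numPartitions];
        for (String key : keys) {
            counts[getPartition(key, numPartitions)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("apple", "avocado", "banana", "cherry");
        // One expected record count per partition (i.e. per reduce task).
        System.out.println(Arrays.toString(countPerPartition(keys, 3)));
    }
}
```

If the per-reducer input counts in the finished job differ from a
tally like this on the same input, the partitioner is likely not the
one being applied.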

For generally testing your MR code end to end, I recommend using the
Apache MRUnit library available at http://incubator.apache.org/mrunit/

On Thu, Feb 16, 2012 at 11:19 PM,  <ext-fabio.almeida@nokia.com> wrote:
> Hello All,
>
> I wrote my own partitioner and I would like to see if it’s working.
>
> By printing the return of method getPartition I could see that the
> partitions were different, but were they really working? To answer that I
> got the keys that every reducer task processed and that was what I expected.
> It seems my partitioner is working properly. But not easy to discover
> though.
>
> Does anyone know if there is an easier way to see if your customized
> partitioner is working? For instance, a counter that shows how many
> partitioners a map generated or a reducer received?
>
> Thanks in advance,
>
> Fabio Almeida



-- 
Harsh J
Customer Ops. Engineer
Cloudera | http://tiny.cloudera.com/about
