hadoop-common-user mailing list archives

From Mehmet Tepedelenlioglu <mehmets...@gmail.com>
Subject Hadoop in Action Partitioner Example
Date Tue, 23 Aug 2011 23:25:35 GMT
For those of you who have the book, on page 49 there is a custom partitioner example. It basically
describes a situation where the map emits <K,V>, but the key is a compound key like
(K1,K2), and we want to reduce over K1s and not the whole of the Ks. This is used as an example
of a situation where a custom partitioner should be written to hash over K1 to send the right
keys to the same reducers. But as far as I know, although this would partition the keys correctly
(send them to the correct reducers), the reduce function would still be called once per original
key K (values are grouped under the full key), not yielding the desired results. The only way of doing this that
I know of is to create a new WritableComparable that carries all of K but only uses K1 in its
hashCode/equals/compareTo methods, in which case you would not need to write your own partitioner
anyway. Am I misinterpreting something the author meant, or is there something I don't know
going on? It would have been sweet if I could accomplish all that with just the partitioner.
Either I am misunderstanding something fundamental, or I am misunderstanding the example's
intention, or there is something wrong with it. 
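
To make the concern concrete, here is a minimal plain-Java sketch that simulates the two steps separately. It does not use any Hadoop classes, and every name in it (CompositeKeyDemo, Key, partition, group) is hypothetical and only for illustration. It assumes the partitioner hashes K1 alone and that reduce groups are formed per distinct grouping value, which is my reading of the behaviour described above rather than a statement about the book's code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java simulation (no Hadoop dependency) of partitioning vs. grouping
// for a compound key (K1, K2). All names are hypothetical.
class CompositeKeyDemo {

    // Compound key carrying both parts.
    static final class Key {
        final String k1;
        final String k2;

        Key(String k1, String k2) {
            this.k1 = k1;
            this.k2 = k2;
        }

        String full() {
            return k1 + "," + k2;
        }
    }

    // Mimics a custom partitioner that hashes only K1, so every key
    // sharing K1 is assigned to the same reducer index.
    static int partition(Key key, int numReducers) {
        return (key.k1.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    // Forms one group per distinct grouping value, standing in for one
    // reduce() call per group. groupByK1Only == false models grouping
    // on the full key (K1, K2); true models grouping on K1 alone.
    static Map<String, List<String>> group(List<Key> keys, boolean groupByK1Only) {
        Map<String, List<String>> groups = new TreeMap<>();
        for (Key key : keys) {
            String groupingValue = groupByK1Only ? key.k1 : key.full();
            groups.computeIfAbsent(groupingValue, x -> new ArrayList<>()).add(key.full());
        }
        return groups;
    }

    public static void main(String[] args) {
        List<Key> keys = Arrays.asList(
                new Key("a", "1"), new Key("a", "2"), new Key("b", "1"));

        // Keys with the same K1 always get the same reducer index.
        for (Key key : keys) {
            System.out.println(key.full() + " -> reducer " + partition(key, 4));
        }

        // Grouping on the full key still yields three separate groups,
        // even though both "a" keys went to the same reducer.
        System.out.println("groups by full key: " + group(keys, false).size());

        // Grouping on K1 alone yields the two groups that were wanted.
        System.out.println("groups by K1 only:  " + group(keys, true).size());
    }
}
```

In this simulation the partitioner alone changes where a key goes, not how keys are grouped once they arrive, which is the gap I am asking about.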


