hadoop-mapreduce-user mailing list archives

From Trevor Adams <trevorad...@gmail.com>
Subject Reduce method called same key twice
Date Wed, 29 Jun 2011 16:59:52 GMT
I have a custom key that is used for a join. It contains two fields: a
boolean (is primary key) and an int (key). hashCode only looks at the key
field, so that records with the same key are sent to the same reducer. The
compare method places the primary-key record at the top of the list (if
sorted using compare). This works nicely, except that the reduce method is
called with Key: 1 -> a single value, Key: 1 -> another value, and so on,
once for each value. So instead of bucketing the values under a key (and some
of the keys are identical in every way), it sends one key and one value to
the reducer at a time. How do I get it to bucket, or why isn't it
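
To illustrate the behaviour described above, here is a minimal sketch in plain Java (no Hadoop dependency) of how reduce calls are formed: sorted keys are walked in order, and a new reduce call starts whenever the grouping comparator returns non-zero for two adjacent keys. The names JoinKey, SORT, GROUP_BY_KEY, NEVER_EQUAL and countReduceCalls are hypothetical and not taken from the original code. In Hadoop the grouping comparator defaults to the sort comparator unless one is set separately (for example with Job.setGroupingComparatorClass in the new API), so a compare method that also looks at the boolean, or one that never returns 0, may be a possible cause of the one-value-per-call symptom.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class GroupingSketch {
    // Hypothetical stand-in for the custom join key: a flag plus an int key.
    static final class JoinKey {
        final boolean primary;
        final int key;
        JoinKey(boolean primary, int key) { this.primary = primary; this.key = key; }
        @Override public int hashCode() { return key; } // partition on key only
        @Override public boolean equals(Object o) {
            return o instanceof JoinKey && ((JoinKey) o).key == key
                    && ((JoinKey) o).primary == primary;
        }
    }

    // Sort order: by key, with the primary-key record first within each key.
    static final Comparator<JoinKey> SORT = (a, b) -> {
        int c = Integer.compare(a.key, b.key);
        return c != 0 ? c : Boolean.compare(b.primary, a.primary);
    };

    // Grouping order: key only, so all records for a key share one reduce call.
    static final Comparator<JoinKey> GROUP_BY_KEY =
            (a, b) -> Integer.compare(a.key, b.key);

    // Assumed bug for illustration: a comparator that never reports equality.
    static final Comparator<JoinKey> NEVER_EQUAL = (a, b) -> {
        int c = Integer.compare(a.key, b.key);
        return c != 0 ? c : (a.primary ? -1 : 1);
    };

    // Counts how many reduce calls the sorted keys would be split into.
    static int countReduceCalls(List<JoinKey> keys,
                                Comparator<JoinKey> sort,
                                Comparator<JoinKey> group) {
        List<JoinKey> sorted = new ArrayList<>(keys);
        sorted.sort(sort);
        int calls = 0;
        JoinKey prev = null;
        for (JoinKey k : sorted) {
            if (prev == null || group.compare(prev, k) != 0) {
                calls++; // adjacent keys differ under the grouping comparator
            }
            prev = k;
        }
        return calls;
    }

    static List<JoinKey> sampleKeys() {
        return Arrays.asList(new JoinKey(false, 1), new JoinKey(true, 1),
                new JoinKey(false, 1), new JoinKey(false, 2));
    }

    public static void main(String[] args) {
        List<JoinKey> keys = sampleKeys();
        System.out.println(countReduceCalls(keys, SORT, SORT));         // 3
        System.out.println(countReduceCalls(keys, SORT, GROUP_BY_KEY)); // 2
        System.out.println(countReduceCalls(keys, SORT, NEVER_EQUAL));  // 4
    }
}
```

With the key-only grouping comparator, the two keys in the sample produce two reduce calls, and the primary-key record still arrives first within its group because the sort order is unchanged. The NEVER_EQUAL case yields one call per record, which matches the symptom reported here, though whether that is the actual cause cannot be confirmed without the original compare code.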

