hadoop-mapreduce-user mailing list archives

From James Hammerton <james.hammer...@mendeley.com>
Subject Strange behaviour from a custom Writable
Date Mon, 08 Feb 2010 17:23:50 GMT

For a particular project I created a custom Writable, LongDoublePair, for
holding a long and a double. My mapper outputs LongDoublePair values and
the reducer receives an Iterable<LongDoublePair>.
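The actual source was attached to the original mail and isn't reproduced here, but a minimal sketch of such a Writable (field and method names assumed for illustration) might look like the following. Only java.io types are used so it compiles standalone; the real class would additionally declare `implements org.apache.hadoop.io.Writable`:

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

// Sketch of a pair-of-primitives Writable; in Hadoop this would
// implement org.apache.hadoop.io.Writable.
public class LongDoublePair {
    private long first;
    private double second;

    // Writables need a public no-arg constructor so the framework
    // can instantiate them reflectively before calling readFields().
    public LongDoublePair() {}

    public LongDoublePair(long first, double second) {
        this.first = first;
        this.second = second;
    }

    public long getFirst() { return first; }

    public double getSecond() { return second; }

    // Serialise the two fields in a fixed order...
    public void write(DataOutput out) throws IOException {
        out.writeLong(first);
        out.writeDouble(second);
    }

    // ...and deserialise them in the same order.
    public void readFields(DataInput in) throws IOException {
        first = in.readLong();
        second = in.readDouble();
    }
}
```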

The problem is that when I try to use it, whilst I get the right number of
elements in the Iterable, they all appear to be the same object! I tested
that this was the case by using the following code in the loop that
processes the pairs:

            LongDoublePair prev = null;
            for (LongDoublePair next : values) {
                if (prev != null && prev == next) {
                    // == is a reference-equality check, not equals()
                    context.getCounter("MY COUNTERS", key.toString()
                            + "Values are same object").increment(1);
                }
                prev = next;
            }

The counters appeared with all sorts of values, e.g. I got lots of lines
like: "10/02/08 16:57:18 INFO mapred.JobClient:     990Values are same
object=46", indicating that the iterator was handing back the same object
repeatedly.

My code works if, instead of using LongDoublePair, I use a Text object and
simply concatenate the two numbers with a space to separate them, then
have the reducer parse the string back into a LongDoublePair and process it.
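For illustration, that Text-based workaround can be sketched as a small hypothetical helper (the class name TextPairCodec and its method names are invented for this example; the original code may well differ):

```java
// Hypothetical helper illustrating the workaround: the mapper emits the
// pair as the string "long<space>double", and the reducer parses it back.
public final class TextPairCodec {
    private TextPairCodec() {}

    // Mapper side: format the two numbers with a single space separator.
    public static String encode(long l, double d) {
        return l + " " + d;
    }

    // Reducer side: recover the long from the first token...
    public static long parseFirst(String s) {
        return Long.parseLong(s.split(" ", 2)[0]);
    }

    // ...and the double from the second token.
    public static double parseSecond(String s) {
        return Double.parseDouble(s.split(" ", 2)[1]);
    }
}
```

Because each Text value is parsed into a freshly allocated pair in the reducer, no two iterations can end up sharing the same pair object.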

Via unit tests I've ensured that LongDoublePair's serialisation and
deserialisation code works, that hashCode and equals do what they should,
etc., but I can't seem to get this to work other than by falling back on
using Text objects. Any ideas what might be going wrong?

I've attached the source code for LongDoublePair to this email in case you
can spot anything that might be behind the problem.


James Hammerton | Senior Data Mining Engineer

Mendeley Limited | London, UK | www.mendeley.com
Registered in England and Wales | Company Number 6419015
