hadoop-mapreduce-user mailing list archives

From Ed Mazur <ma...@cs.umass.edu>
Subject Re: Strange behaviour from a custom Writable
Date Mon, 08 Feb 2010 19:09:06 GMT
Hi James,

I ran into something similar in the past and suspect the problem is
in your reduce function. Are you buffering values from the iterator?
Hadoop reuses a single value object across the iteration and
deserializes each record into it, so if you keep the values you first
need to copy each one as you take it from the iterator (for example,
by implementing Cloneable in your custom Writable). Otherwise every
buffered entry will be a reference to that same object, which ends up
holding the last item from the iterator.
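
To make the aliasing concrete, here is a minimal sketch that runs without
Hadoop. Everything in it is an assumption for illustration: the fields of
LongDoublePair, the reusingValues helper that imitates an iterator refilling
one shared instance, and the use of a copy constructor in place of Cloneable.
It is not the attached LongDoublePair source.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in for the custom Writable; field names are assumptions.
final class LongDoublePair {
    long first;
    double second;

    LongDoublePair() {}

    // Copy constructor: one way to snapshot a value (Cloneable would also work).
    LongDoublePair(LongDoublePair other) {
        this.first = other.first;
        this.second = other.second;
    }

    void set(long first, double second) {
        this.first = first;
        this.second = second;
    }
}

final class ReuseDemo {
    // Imitates the reducer's value iterator: one instance, refilled on each next().
    static Iterable<LongDoublePair> reusingValues(long[] longs, double[] doubles) {
        LongDoublePair shared = new LongDoublePair();
        return () -> new Iterator<LongDoublePair>() {
            private int i = 0;

            public boolean hasNext() {
                return i < longs.length;
            }

            public LongDoublePair next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                shared.set(longs[i], doubles[i]);
                i++;
                return shared; // same reference every time
            }
        };
    }

    // Buggy buffering: stores the shared reference repeatedly.
    static List<LongDoublePair> bufferReferences(Iterable<LongDoublePair> values) {
        List<LongDoublePair> out = new ArrayList<>();
        for (LongDoublePair v : values) {
            out.add(v);
        }
        return out;
    }

    // Fixed buffering: stores an independent copy of each value.
    static List<LongDoublePair> bufferCopies(Iterable<LongDoublePair> values) {
        List<LongDoublePair> out = new ArrayList<>();
        for (LongDoublePair v : values) {
            out.add(new LongDoublePair(v));
        }
        return out;
    }

    public static void main(String[] args) {
        long[] longs = {1L, 2L, 3L};
        double[] doubles = {0.1, 0.2, 0.3};
        List<LongDoublePair> aliased = bufferReferences(reusingValues(longs, doubles));
        List<LongDoublePair> copied = bufferCopies(reusingValues(longs, doubles));
        System.out.println(aliased.get(0).first + " " + aliased.get(2).first); // 3 3
        System.out.println(copied.get(0).first + " " + copied.get(2).first);   // 1 3
    }
}
```

In a real reducer the same idea would mean copying each value before adding
it to a list; values that are consumed immediately inside the loop do not
need a copy.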

Ed

On Mon, Feb 8, 2010 at 12:23 PM, James Hammerton
<james.hammerton@mendeley.com> wrote:
> Hi,
>
> For a particular project I created a writable for holding a long and a
> double called LongDoublePair. My mapper outputs LongDoublePair values and
> the reducer receives an Iterable<LongDoublePair>.
>
> The problem is that when I try to use it, whilst I get the right number of
> elements in the Iterable, they are all copies of the same object! I tested
> that this was the case by using the following code in the loop that
> processes the pairs:
>
>             if (prev != null) {
>                 if (prev == next) {
>                     context.getCounter("MY COUNTERS",
>                             key.toString() + "Values are same object").increment(1);
>                 }
>             } else {
>                 prev = next;
>             }
>
> The counters appeared with all sorts of values, e.g. I got lots of lines
> like: "10/02/08 16:57:18 INFO mapred.JobClient:     990Values are same
> object=46", indicating that the iterator was iterating through copies of the
> same object.
>
> My code works if instead of using the LongDoublePair I use a Text object and
> simply concatenate the two number strings with a space to separate them and
> have the reducer parse the string into a LongDoublePair and process it.
>
> Via unit tests, I've ensured the LongDoublePair's serialisation and
> deserialisation code works, that hashCode and equals do what they should do,
> etc, but I can't seem to get this to work other than by falling back on
> using Text objects. Any ideas what might be going wrong?
>
> I've attached the source code for LongDoublePair to this email in case you
> can spot anything that might be behind the problem.
>
> James
>
> --
> James Hammerton | Senior Data Mining Engineer
> www.mendeley.com/profiles/james-hammerton
>
> Mendeley Limited | London, UK | www.mendeley.com
> Registered in England and Wales | Company Number 6419015