hadoop-common-user mailing list archives

From "Owen O'Malley" <omal...@apache.org>
Subject Re: Very strange Java Collection behavior in Hadoop
Date Tue, 20 Mar 2012 06:20:22 GMT
On Mon, Mar 19, 2012 at 11:05 PM, madhu phatak <phatak.dev@gmail.com> wrote:

> Hi Owen O'Malley,
>  Thank you for that Instant reply. It's working now. Can you explain me
> what you mean by "input to reducer is reused" in little detail?

Each time the statement "Text value = values.next();" is executed, it
returns the same Text object, with the contents of that object changed. When
you add the Text to the list, you are adding a reference to that same Text
object, not a copy of it. At the end you have 6 references to one object
instead of 6 different Text objects.
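
To make the effect concrete, here is a minimal sketch in plain Java that does not depend on Hadoop. MutableText, reusingIterator, and the two collect methods are hypothetical stand-ins invented for illustration; only the reuse pattern itself is taken from the explanation above. In real reducer code, the analogous fix would be to copy each value before storing it, for example with new Text(value).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class ReuseDemo {
    // Hypothetical stand-in for a mutable, reusable value such as Hadoop's Text.
    static final class MutableText {
        private String value = "";
        MutableText() {}
        // Copy constructor: takes a snapshot of the other object's current contents.
        MutableText(MutableText other) { this.value = other.value; }
        void set(String s) { this.value = s; }
        @Override public String toString() { return value; }
    }

    // Iterator that hands back the SAME object on every call to next(),
    // overwriting its contents each time (the reuse optimization).
    static Iterator<MutableText> reusingIterator(List<String> data) {
        final MutableText shared = new MutableText();
        final Iterator<String> source = data.iterator();
        return new Iterator<MutableText>() {
            @Override public boolean hasNext() { return source.hasNext(); }
            @Override public MutableText next() {
                shared.set(source.next());
                return shared;
            }
        };
    }

    // Buggy: stores the shared reference, so every list entry is the same object.
    static List<String> collectWithoutCopy(List<String> data) {
        List<MutableText> kept = new ArrayList<>();
        Iterator<MutableText> values = reusingIterator(data);
        while (values.hasNext()) {
            kept.add(values.next());
        }
        return render(kept);
    }

    // Fixed: stores a fresh copy of each value before the iterator overwrites it.
    static List<String> collectWithCopy(List<String> data) {
        List<MutableText> kept = new ArrayList<>();
        Iterator<MutableText> values = reusingIterator(data);
        while (values.hasNext()) {
            kept.add(new MutableText(values.next()));
        }
        return render(kept);
    }

    // Converts the kept objects to strings only after iteration has finished.
    private static List<String> render(List<MutableText> items) {
        List<String> out = new ArrayList<>();
        for (MutableText t : items) {
            out.add(t.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> data = Arrays.asList("a", "b", "c");
        System.out.println(collectWithoutCopy(data)); // prints [c, c, c]
        System.out.println(collectWithCopy(data));    // prints [a, b, c]
    }
}
```

Without the copy, every entry in the list shows the last value seen, because all entries refer to the one object the iterator keeps overwriting.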

I said it is my fault because I added the optimization that causes it. If
you are interested in Hadoop archeology, HADOOP-2399 made the change. I also
did HADOOP-3522 to improve the documentation in that area.

-- Owen
