hadoop-common-user mailing list archives

From "Owen O'Malley" <...@yahoo-inc.com>
Subject Beware sun's jvm version 1.6.0_05-b13 on linux
Date Fri, 15 May 2009 18:38:32 GMT
We have observed that the default JVM on RedHat 5 can cause
significant data corruption in the map/reduce shuffle for those using
Hadoop 0.20. In particular, the guilty JVM is:

java version "1.6.0_05"
Java(TM) SE Runtime Environment (build 1.6.0_05-b13)
Java HotSpot(TM) Server VM (build 10.0-b19, mixed mode)
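
A startup guard for this build could look like the minimal sketch
below. It is not part of Hadoop; the class name JvmVersionCheck and
its methods are hypothetical. It assumes the JVM exposes the
"java.runtime.version" system property (Sun JVMs do, and it carries
the same build string that "java -version" prints above).

```java
public class JvmVersionCheck {
    // Runtime build string of the affected JVM, as quoted in this message.
    static final String KNOWN_BAD_RUNTIME = "1.6.0_05-b13";

    // Returns true only for an exact match with the affected build.
    static boolean isKnownBadRuntime(String runtimeVersion) {
        return runtimeVersion != null
                && runtimeVersion.trim().equals(KNOWN_BAD_RUNTIME);
    }

    public static void main(String[] args) {
        // Assumption: this vendor property is present; it may be null elsewhere.
        String runtime = System.getProperty("java.runtime.version");
        if (isKnownBadRuntime(runtime)) {
            System.err.println("JVM build " + runtime
                    + " is known to corrupt shuffle data; please upgrade.");
            System.exit(1);
        }
        System.out.println("JVM build " + runtime
                + " is not the known-bad build.");
    }
}
```

The check matches one exact build string only, so it says nothing
about other builds that may carry the same bug.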

Upgrading to JVM build 1.6.0_13-b03 fixed the problem for us. The
observed behavior is that Jetty serves up random bytes from other
transfers. In some cases a reduce received a complete, valid transfer
that was meant for a different reduce. We suspect the relevant Java
bug is:

http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6403933

We have also filed a bug on Hadoop to add sanity checks on the shuffle  
that will work around the problem:

https://issues.apache.org/jira/browse/HADOOP-5783
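
To illustrate the kind of check meant here, below is a minimal
sketch. It is not the patch attached to HADOOP-5783. The class
ShuffleSanityCheck, its methods, and the header fields (reduce id,
length, CRC) are all hypothetical, and it assumes the sender puts
those three values in front of each payload.

```java
import java.util.zip.CRC32;

public class ShuffleSanityCheck {
    // CRC32 of the whole payload, as the sender would compute it.
    static long checksum(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        return crc.getValue();
    }

    // Accept a transfer only if it is addressed to this reduce, has the
    // advertised length, and matches the advertised checksum.
    static boolean isValid(int expectedReduce, int headerReduce,
                           long headerLength, long headerCrc,
                           byte[] payload) {
        if (headerReduce != expectedReduce) {
            return false; // a valid transfer, but for another reduce
        }
        if (headerLength != payload.length) {
            return false; // truncated or padded transfer
        }
        return headerCrc == checksum(payload); // bytes changed in flight
    }
}
```

The reduce id comparison matters because a checksum alone would
accept an intact transfer that was delivered to the wrong reduce.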

-- Owen
