Subject: Spill Failed Caused by ArrayIndexOutOfBoundsException
Date: Mon, 6 Jan 2014 12:21:00 -0800
From: Paul Mahon
To: user@hadoop.apache.org

I have a Hadoop program that I'm running with version 1.2.1 which fails in a peculiar place. Most mappers complete without error, but some fail with this stack trace:

java.io.IOException: Spill failed
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1297)
	at org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:698)
	at org.apache.hadoop.mapred.MapTask.closeQuietly(MapTask.java:1793)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:779)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
	at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.ArrayIndexOutOfBoundsException: 99614720
	at org.apache.hadoop.io.WritableComparator.readInt(WritableComparator.java:158)
	at org.apache.hadoop.io.BooleanWritable$Comparator.compare(BooleanWritable.java:103)
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.compare(MapTask.java:1116)
	at org.apache.hadoop.util.QuickSort.sortInternal(QuickSort.java:95)
	at org.apache.hadoop.util.QuickSort.sort(QuickSort.java:59)
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1404)
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$1800(MapTask.java:858)
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1349)

I've noticed that the array index is exactly the size of bufvoid, but I'm not sure whether that has any significance. The exception isn't happening in my WritableComparable or any of my code; it's all inside Hadoop. I'm not sure how to track down what I'm doing to cause the problem. Has anyone seen a problem like this, or does anyone have suggestions on where to look for the problem in my code?
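In case it helps: the frame that throws is WritableComparator.readInt, which the BooleanWritable comparator uses to read raw bytes straight out of the serialized sort buffer during the spill sort. Here is a small stand-alone sketch of what I believe that method does (based on my reading of the 1.2.1 source, so treat the details as approximate), showing how being handed an offset at or past the end of the buffer produces exactly this exception:

```java
// Sketch only: approximates WritableComparator.readInt from Hadoop 1.2.1,
// which reads a big-endian int from a byte array at a given offset.
public class ReadIntDemo {
    static int readInt(byte[] bytes, int start) {
        return ((bytes[start] & 0xff) << 24)
             | ((bytes[start + 1] & 0xff) << 16)
             | ((bytes[start + 2] & 0xff) << 8)
             |  (bytes[start + 3] & 0xff);
    }

    public static void main(String[] args) {
        byte[] buf = new byte[8];
        buf[3] = 42;                               // big-endian int 42 at offset 0
        System.out.println(readInt(buf, 0));       // prints 42

        try {
            readInt(buf, buf.length);              // offset == buffer length
        } catch (ArrayIndexOutOfBoundsException e) {
            // Same exception class as in the stack trace above; the message
            // carries the bad index, just as 99614720 appears in mine.
            System.out.println("AIOOBE at offset " + buf.length);
        }
    }
}
```

So the index in the exception message is the offset the comparator was asked to read from, which in my case is one past the end of the sort buffer.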
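For what it's worth, here is how the 99614720 figure could line up with bufvoid. This is just my arithmetic assuming the Hadoop 1.x defaults io.sort.mb=100 and io.sort.record.percent=0.05; I haven't verified those are what my job is actually using:

```java
// Hypothetical reconstruction of the sort-buffer sizing in MapOutputBuffer,
// assuming default io.sort.* settings (unverified for my job).
public class BufvoidCheck {
    public static void main(String[] args) {
        int sortMb = 100;                      // io.sort.mb (assumed default)
        float recordPercent = 0.05f;           // io.sort.record.percent (assumed default)

        int maxMemUsage = sortMb << 20;        // 100 MB = 104857600 bytes total
        int recordCapacity = (int) (maxMemUsage * recordPercent); // 5242880 bytes for record metadata
        int dataBufferSize = maxMemUsage - recordCapacity;        // remainder holds serialized data

        System.out.println(dataBufferSize);    // prints 99614720
    }
}
```

If the data buffer (and hence bufvoid) really is 99614720 bytes here, then the comparator is being asked to read an int starting exactly at the end of the buffer, which would explain why the failing index matches bufvoid exactly.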