hadoop-mapreduce-user mailing list archives

From rakesh kothari <rkothari_...@hotmail.com>
Subject Failures in the reducers
Date Tue, 12 Oct 2010 19:53:11 GMT

Hi,

My MR job is processing 24 gzipped input files, each around 450 MB. The HDFS block size is 512 MB.

The job fails consistently in the reduce phase with the exceptions below. Any ideas on how to troubleshoot this?

Thanks,
-Rakesh

Reduce task logs:

INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 10 segments left of total size: 408736960 bytes
2010-10-12 07:25:01,020 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink 10.185.13.61:50010
2010-10-12 07:25:01,021 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-961587459095414398_368580
2010-10-12 07:25:07,206 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink 10.185.13.61:50010
2010-10-12 07:25:07,206 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-7795697604292519140_368580
2010-10-12 07:27:05,526 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.EOFException
2010-10-12 07:27:05,527 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-7687883740524807660_368625
2010-10-12 07:27:11,713 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.EOFException
2010-10-12 07:27:11,713 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-5546440551650461919_368626
2010-10-12 07:27:17,898 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.EOFException
2010-10-12 07:27:17,898 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-3894897742813130478_368628
2010-10-12 07:27:24,081 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.EOFException
2010-10-12 07:27:24,081 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_8687736970664350304_368652
2010-10-12 07:27:30,186 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception: java.io.IOException: Unable to create new block.
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2812)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2076)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2262)
2010-10-12 07:27:30,186 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_8687736970664350304_368652 bad datanode[0] nodes == null
2010-10-12 07:27:30,186 WARN org.apache.hadoop.hdfs.DFSClient: Could not get block locations. Source file "/tmp/dartlog-json-serializer/20100929_/_temporary/_attempt_201010082153_0040_r_000000_2/jp/dart-imp-json/2010/09/29/17/part-r-00000.gz" - Aborting...
2010-10-12 07:27:30,196 WARN org.apache.hadoop.mapred.TaskTracker: Error running child
java.io.EOFException
        at java.io.DataInputStream.readByte(DataInputStream.java:250)
        at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:298)
        at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:319)
        at org.apache.hadoop.io.Text.readString(Text.java:400)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:2868)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2793)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2076)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2262)
2010-10-12 07:27:30,199 INFO org.apache.hadoop.mapred.TaskRunner: Runnning cleanup for the task



The datanode is throwing the following exceptions:

2010-10-12 07:27:30,026 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block blk_-892355450837523222_368657 src: /10.43.102.69:42352 dest: /10.43.102.69:50010
2010-10-12 07:27:30,206 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: writeBlock blk_-892355450837523222_368657 received exception java.io.EOFException
2010-10-12 07:27:30,206 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.43.102.69:50010, storageID=DS-859924705-10.43.102.69-50010-1271546912162, infoPort=8501, ipcPort=50020):DataXceiver
java.io.EOFException
        at java.io.DataInputStream.readByte(DataInputStream.java:250)
        at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:298)
        at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:319)
        at org.apache.hadoop.io.Text.readString(Text.java:400)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:313)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:103)
        at java.lang.Thread.run(Thread.java:619)
2010-10-12 07:27:30,272 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block blk_786696549206331718_368657 src: /10.184.82.24:53457 dest: /10.43.102.69:50010
2010-10-12 07:27:30,459 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block blk_-6729043740571856940_368657 src: /10.185.13.60:41816 dest: /10.43.102.69:50010
2010-10-12 07:27:30,468 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.185.13.61:48770, dest: /10.43.102.69:50010, bytes: 1626784, op: HDFS_WRITE, cliID: DFSClient_attempt_201010082153_0040_r_000000_2, srvID: DS-859924705-10.43.102.69-50010-1271546912162, blockid: blk_9216465415312085861_368611
2010-10-12 07:27:30,468 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for block blk_9216465415312085861_368611 terminating
2010-10-12 07:27:30,755 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification succeeded for blk_5680087852988027619_321244
2010-10-12 07:27:30,759 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification succeeded for blk_-1637914415591966611_321290

…

2010-10-12 07:27:56,412 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.43.102.69:50010, storageID=DS-859924705-10.43.102.69-50010-1271546912162, infoPort=8501, ipcPort=50020):DataXceiver
java.io.IOException: xceiverCount 258 exceeds the limit of concurrent xcievers 256
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:88)
        at java.lang.Thread.run(Thread.java:619)
2010-10-12 07:27:56,976 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification succeeded for blk_5731266331675183628_321238
2010-10-12 07:27:57,669 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.43.102.69:50010, storageID=DS-859924705-10.43.102.69-50010-1271546912162, infoPort=8501, ipcPort=50020):DataXceiver
java.io.IOException: xceiverCount 258 exceeds the limit of concurrent xcievers 256
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:88)
        at java.lang.Thread.run(Thread.java:619)
2010-10-12 07:27:58,976 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.43.102.69:50010, storageID=DS-859924705-10.43.102.69-50010-1271546912162, infoPort=8501, ipcPort=50020):DataXceiver
java.io.IOException: xceiverCount 258 exceeds the limit of concurrent xcievers 256
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:88)
        at java.lang.Thread.run(Thread.java:619)
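
Side note: the repeated "xceiverCount 258 exceeds the limit of concurrent xcievers 256" errors suggest the datanode's DataXceiver thread limit is being exhausted. If that turns out to be the cause, on 0.20-era Hadoop the limit is controlled by dfs.datanode.max.xcievers (the misspelling is Hadoop's own) in hdfs-site.xml, default 256; one common mitigation is to raise it on every datanode and restart them, e.g.:

```xml
<!-- hdfs-site.xml on each datanode.
     Raises the cap on concurrent DataXceiver threads; the value 4096 is
     an illustrative choice, not a recommendation verified for this cluster. -->
<property>
  <name>dfs.datanode.max.xcievers</name>
  <value>4096</value>
</property>
```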


 		 	   		  