hadoop-common-user mailing list archives

From Mark question <markq2...@gmail.com>
Subject Connection reset by peer Error
Date Sun, 20 Nov 2011 22:40:07 GMT
Hi,

I keep getting the error below. The namenode log mentions a peer resetting
the connection, but I don't know why that would happen, since everything is
running on a single machine with 12 cores. Any ideas?

The job starts running normally. It has about 200 mappers, each of which
opens 200 files (one file at a time inside the mapper code). Then:
......
.....
...
11/11/20 06:27:52 INFO mapred.JobClient:  map 55% reduce 0%
11/11/20 06:28:38 INFO mapred.JobClient:  map 56% reduce 0%
11/11/20 06:29:18 INFO mapred.JobClient: Task Id : attempt_201111200450_0001_m_000219_0, Status : FAILED
org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/mark/output/_temporary/_attempt_201111200450_0001_m_000219_0/part-00219 could only be replicated to 0 nodes, instead of 1
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1271)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
    at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)

    at org.apache.hadoop.ipc.Client.call(Client.java:740)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
    at $Proxy1.addBlock(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
    at $Proxy1.addBlock(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2937)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2819)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2102)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2288)

   .......
   .......

 Namenode Log:

2011-11-20 06:29:51,964 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: ugi=mark,ucsb    ip=/127.0.0.1    cmd=open    src=/user/mark/input/G14_10_al    dst=null    perm=null
2011-11-20 06:29:52,039 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: ugi=mark,ucsb    ip=/127.0.0.1    cmd=open    src=/user/mark/input/G13_12_aq    dst=null    perm=null
2011-11-20 06:29:52,178 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: ugi=mark,ucsb    ip=/127.0.0.1    cmd=open    src=/user/mark/input/G14_10_an    dst=null    perm=null
2011-11-20 06:29:52,348 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated: 127.0.0.1:50010 is added to blk_-2308051162058662821_1643 size 20024660
2011-11-20 06:29:52,348 INFO org.apache.hadoop.hdfs.StateChange: DIR* NameSystem.completeFile: file /user/mark/output/_temporary/_attempt_201111200450_0001_m_000222_0/part-00222 is closed by DFSClient_attempt_201111200450_0001_m_000222_0
2011-11-20 06:29:52,351 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated: 127.0.0.1:50010 is added to blk_9206172750679206987_1639 size 51330092
2011-11-20 06:29:52,352 INFO org.apache.hadoop.hdfs.StateChange: DIR* NameSystem.completeFile: file /user/mark/output/_temporary/_attempt_201111200450_0001_m_000226_0/part-00226 is closed by DFSClient_attempt_201111200450_0001_m_000226_0
2011-11-20 06:29:52,416 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: ugi=mark,ucsb    ip=/127.0.0.1    cmd=create    src=/user/mark/output/_temporary/_attempt_201111200450_0001_m_000223_2/part-00223    dst=null    perm=mark:supergroup:rw-r--r--
2011-11-20 06:29:52,430 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 12123: readAndProcess threw exception java.io.IOException:Connection reset by peer. Count of bytes read: 0
java.io.IOException: Connection reset by peer
    at sun.nio.ch.FileDispatcher.read0(Native Method)
    at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21)
    at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:202)
    at sun.nio.ch.IOUtil.read(IOUtil.java:175)
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:243)
    at org.apache.hadoop.ipc.Server.channelRead(Server.java:1211)
    at org.apache.hadoop.ipc.Server.access$2300(Server.java:77)
    at org.apache.hadoop.ipc.Server$Connection.readAndProcess(Server.java:799)
    at org.apache.hadoop.ipc.Server$Listener.doRead(Server.java:419)
    at org.apache.hadoop.ipc.Server$Listener.run(Server.java:328)

TaskTracker:

2011-11-20 06:29:55,772 WARN org.apache.hadoop.fs.FileSystem: "localhost:12123" is a deprecated filesystem name. Use "hdfs://localhost:12123/" instead.
2011-11-20 06:30:01,441 ERROR org.apache.hadoop.mapred.TaskTracker: Caught exception: java.io.IOException: Call to localhost/127.0.0.1:10001 failed on local exception: java.io.EOFException
    at org.apache.hadoop.ipc.Client.wrapException(Client.java:775)
    at org.apache.hadoop.ipc.Client.call(Client.java:743)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
    at org.apache.hadoop.mapred.$Proxy4.heartbeat(Unknown Source)
    at org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1215)
    at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1037)
    at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:1720)
    at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:2833)
Caused by: java.io.EOFException
    at java.io.DataInputStream.readInt(DataInputStream.java:375)
    at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:501)
    at org.apache.hadoop.ipc.Client$Connection.run(Client.java:446)

2011-11-20 06:30:01,441 INFO org.apache.hadoop.mapred.TaskTracker: Resending 'status' to 'localhost' with reponseId '1936


Thank you,
Mark
