hadoop-hdfs-issues mailing list archives

From "Elliott Clark (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDFS-9669) TcpPeerServer should respect ipc.server.listen.queue.size
Date Wed, 20 Jan 2016 07:49:39 GMT
Elliott Clark created HDFS-9669:

             Summary: TcpPeerServer should respect ipc.server.listen.queue.size
                 Key: HDFS-9669
                 URL: https://issues.apache.org/jira/browse/HDFS-9669
             Project: Hadoop HDFS
          Issue Type: Bug
            Reporter: Elliott Clark

During periods of high traffic we are seeing:

16/01/19 23:40:40 WARN hdfs.DFSClient: Connection failure: Failed to connect to /
for file /MYPATH/MYFILE for block BP-1935559084-
Connection reset by peer
java.io.IOException: Connection reset by peer
	at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
	at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
	at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
	at sun.nio.ch.IOUtil.write(IOUtil.java:65)
	at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471)
	at org.apache.hadoop.net.SocketOutputStream$Writer.performIO(SocketOutputStream.java:63)
	at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
	at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:159)
	at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:117)
	at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:109)
	at java.io.DataOutputStream.writeInt(DataOutputStream.java:197)

At the time this happens there are far fewer xceivers than configured.

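TcpPeerServer does not apply ipc.server.listen.queue.size when it binds its listening socket.
On most JDKs this leaves the total accept backlog at the default of 50 at any time. This
effectively means that any GC pause combined with a busy period will result in TCP resets.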
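
To illustrate the backlog behaviour described above, here is a minimal sketch of binding a listening socket with an explicit backlog taken from configuration. It is not the TcpPeerServer code or the eventual patch: the class BacklogSketch, its helper methods, the use of java.util.Properties in place of Hadoop's Configuration, and the fallback value of 128 are all assumptions made for the example. The only JDK behaviour relied on is that ServerSocket.bind(SocketAddress, int) accepts a requested backlog.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.nio.channels.ServerSocketChannel;
import java.util.Properties;

// Hypothetical illustration only; not part of Hadoop.
final class BacklogSketch {
    // The configuration key named in the issue summary.
    static final String QUEUE_SIZE_KEY = "ipc.server.listen.queue.size";
    // Assumed fallback for this sketch when the key is unset or invalid.
    static final int FALLBACK_BACKLOG = 128;

    // Reads the requested backlog from a Properties object (a stand-in
    // for Hadoop's Configuration) and falls back if it is missing or <= 0.
    static int resolveBacklog(Properties conf) {
        String raw = conf.getProperty(QUEUE_SIZE_KEY);
        if (raw == null) {
            return FALLBACK_BACKLOG;
        }
        int value = Integer.parseInt(raw.trim());
        return value > 0 ? value : FALLBACK_BACKLOG;
    }

    // Binds a server socket on the loopback interface, passing the backlog
    // explicitly instead of relying on the JDK default.
    static ServerSocket bindWithBacklog(int port, int backlog) throws IOException {
        ServerSocket socket = ServerSocketChannel.open().socket();
        socket.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), port), backlog);
        return socket;
    }

    public static void main(String[] args) throws IOException {
        Properties conf = new Properties();
        conf.setProperty(QUEUE_SIZE_KEY, "256");
        // Port 0 asks the OS for any free port.
        try (ServerSocket socket = bindWithBacklog(0, resolveBacklog(conf))) {
            System.out.println("listening on port " + socket.getLocalPort());
        }
    }
}
```

The backlog passed to bind is only a request; the operating system may cap it (on Linux, via net.core.somaxconn), so that limit may need raising as well for a larger value to take effect.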


This message was sent by Atlassian JIRA
