kafka-dev mailing list archives

From Rick Jones <rick.jon...@hpe.com>
Subject Re: Connection reset by peer Error
Date Mon, 20 Jun 2016 15:44:26 GMT
On 06/20/2016 05:28 AM, Avi Asulin wrote:
> Hi,
> We are using Kafka 0.8.2 (Scala 2.10 build).
> We currently have 3 brokers and ~170 producers.
> We frequently get this error:
>
> ERROR Closing socket for /170.144.181.50 because of error (kafka.network.Processor)
> java.io.IOException: Connection reset by peer
>          at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
>          at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
>          at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
>          at sun.nio.ch.IOUtil.read(IOUtil.java:197)
>          at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:384)
>          at kafka.utils.Utils$.read(Utils.scala:380)
>          at kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54)
>          at kafka.network.Processor.read(SocketServer.scala:444)
>          at kafka.network.Processor.run(SocketServer.scala:340)
>          at java.lang.Thread.run(Thread.java:745)
>
> We get the error for many producer IPs.
> Can someone explain what can cause this error and what can be done to get
> rid of it?

There can be a few different reasons for a connection reset by peer 
(arrival of a reset, aka RST, segment).

TCP provides the semantics of a full-duplex data stream, but it also
supports a simplex stream (data in one direction only).  So, you can
tell TCP, via the shutdown() call, that you no longer expect to send
data (SHUT_WR) or to receive data (SHUT_RD).  If one sets SHUT_RD, and
data later arrives, TCP may interpret that as an error and reset the
connection, that is, send a ReSeT (RST) segment.  The exact behavior
varies by stack.
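
As a rough illustration of the simplex case, here is a minimal Python
sketch of a half-close over loopback.  The function name and the
"request"/"reply" payloads are made up for the example.  It only shows
SHUT_WR, since what happens after SHUT_RD differs between stacks.

```python
import socket


def half_close_demo():
    """Send a request, half-close with SHUT_WR, and still read the reply."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)

    client = socket.create_connection(listener.getsockname())
    server_side, _ = listener.accept()

    client.sendall(b"request")
    client.shutdown(socket.SHUT_WR)  # "I will send no more data" (sends a FIN)

    # The server reads until it sees end-of-stream from the client.
    received = b""
    while True:
        chunk = server_side.recv(1024)
        if not chunk:
            break
        received += chunk

    # The other direction is still open, so the server can still answer.
    server_side.sendall(b"reply to " + received)
    server_side.close()

    reply = b""
    while True:
        chunk = client.recv(1024)
        if not chunk:
            break
        reply += chunk

    client.close()
    listener.close()
    return reply


if __name__ == "__main__":
    print(half_close_demo())
```

The client still receives the full reply after its own SHUT_WR, which
is the "data in one direction only" mode described above.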

A close() call implicitly sets both SHUT_WR and SHUT_RD.  So, if the
far side called close(), and data then arrived, that would trigger an
RST.
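
A closely related case is easier to reproduce deterministically: the
far side calls close() while data it never read is still sitting in
its receive queue.  The sketch below assumes Linux, where the stack
then sends an RST rather than a FIN.  Other stacks may behave
differently, and the function name is made up for the example.

```python
import select
import socket


def close_with_unread_data_demo():
    """Peer closes without reading our data; we see a reset, not a clean EOF."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)

    client = socket.create_connection(listener.getsockname())
    server_side, _ = listener.accept()

    client.sendall(b"data nobody reads")

    # Wait until the data is queued on the server side, then close
    # without reading it.  Assumption: Linux answers this with an RST.
    select.select([server_side], [], [], 5)
    server_side.close()

    client.settimeout(5)
    try:
        client.recv(1024)
        outcome = "clean EOF"
    except ConnectionResetError:
        outcome = "reset"  # errno ECONNRESET: "Connection reset by peer"
    finally:
        client.close()
        listener.close()
    return outcome


if __name__ == "__main__":
    print(close_with_unread_data_demo())
```

In the Kafka case the roles would be reversed: the producer would be
the side closing, and the broker would be the side that sees the
reset.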

Also, some applications might abuse SO_LINGER to generate what is called
an "abortive close", which sends an RST immediately.
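
For reference, an abortive close looks roughly like the Python sketch
below: SO_LINGER is enabled with a linger time of zero, so close()
discards the connection and sends an RST.  The struct layout ("ii",
two C ints) is an assumption that holds on Linux and other Unix-like
systems but not on Windows.  The function name is made up for the
example.

```python
import socket
import struct


def abortive_close_demo():
    """Close with SO_LINGER on and a zero timeout; the peer sees a reset."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)

    client = socket.create_connection(listener.getsockname())
    server_side, _ = listener.accept()

    # struct linger { l_onoff = 1, l_linger = 0 }: linger enabled, zero seconds.
    # Assumption: both fields are C ints, as on Linux.
    client.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                      struct.pack("ii", 1, 0))
    client.close()  # abortive close: RST instead of the normal FIN handshake

    server_side.settimeout(5)
    try:
        server_side.recv(1024)
        outcome = "clean EOF"
    except ConnectionResetError:
        outcome = "reset"  # errno ECONNRESET: "Connection reset by peer"
    finally:
        server_side.close()
        listener.close()
    return outcome


if __name__ == "__main__":
    print(abortive_close_demo())
```

The reading side gets the same "Connection reset by peer" error that
shows up in the broker log.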

Further, if a TCP ends up retransmitting until it times out, say because
ACKs were being lost, it may send an RST, and that RST could make it through.

happy benchmarking,

rick jones
