cassandra-user mailing list archives

From Jonathan Ellis <jbel...@gmail.com>
Subject Re: tcp CLOSE_WAIT bug
Date Thu, 22 Apr 2010 02:03:14 GMT
But those connections aren't supposed to ever terminate unless a node
dies or is partitioned.  So if we "fix" it by adding a socket.close I
worry that we're covering up something more important.

On Wed, Apr 21, 2010 at 8:53 PM, Ingram Chen <ingramchen@gmail.com> wrote:
> I agree with your point. I have patched the code to log more information so
> I can find out the real cause.
>
> Here is the code snippet that I think may be the cause:
>
> IncomingTcpConnection:
>
>     public void run()
>     {
>         while (true)
>         {
>             try
>             {
>                 MessagingService.validateMagic(input.readInt());
>                 int header = input.readInt();
>                 int type = MessagingService.getBits(header, 1, 2);
>                 boolean isStream = MessagingService.getBits(header, 3, 1) == 1;
>                 int version = MessagingService.getBits(header, 15, 8);
>
>                 if (isStream)
>                 {
>                     new IncomingStreamReader(socket.getChannel()).read();
>                 }
>                 else
>                 {
>                     int size = input.readInt();
>                     byte[] contentBytes = new byte[size];
>                     input.readFully(contentBytes);
>                     MessagingService.getDeserializationExecutor().submit(
>                         new MessageDeserializationTask(new ByteArrayInputStream(contentBytes)));
>                 }
>             }
>             catch (EOFException e)
>             {
>                 if (logger.isTraceEnabled())
>                     logger.trace("eof reading from socket; closing", e);
>                 break;
>             }
>             catch (IOException e)
>             {
>                 if (logger.isDebugEnabled())
>                     logger.debug("error reading from socket; closing", e);
>                 break;
>             }
>         }
>     }
>
> Under normal conditions, the while loop terminates when input.readInt() throws
> EOFException, but it exits without calling socket.close(). What I did was wrap
> the whole while block in try { ... } finally { socket.close(); }.
>
>
> On Thu, Apr 22, 2010 at 01:14, Jonathan Ellis <jbellis@gmail.com> wrote:
>>
>> I'd like to get something besides "I'm seeing close wait but i have no
>> idea why" for a bug report, since most people aren't seeing that.
>>
>> On Tue, Apr 20, 2010 at 9:33 AM, Ingram Chen <ingramchen@gmail.com> wrote:
>> > I traced the IncomingStreamReader source and found that the incoming socket
>> > comes from MessagingService$SocketThread, but there is no close() call on
>> > either the accepted socket or the socketChannel.
>> >
>> > Should I file a bug report?
>> >
>> > On Tue, Apr 20, 2010 at 11:02, Ingram Chen <ingramchen@gmail.com> wrote:
>> >>
>> >> This happened after several hours of operation, and both nodes were
>> >> started at the same time (a clean start without any data), so it might
>> >> not be related to bootstrap.
>> >>
>> >> In system.log I do not see any messages like "xxx node dead" or any
>> >> exceptions, and both test nodes are alive and serving reads and writes
>> >> well. The four connections between the nodes shown below stay healthy
>> >> the whole time.
>> >>
>> >> tcp        0      0 ::ffff:192.168.2.87:7000    ::ffff:192.168.2.88:58447   ESTABLISHED
>> >> tcp        0      0 ::ffff:192.168.2.87:54986   ::ffff:192.168.2.88:7000    ESTABLISHED
>> >> tcp        0      0 ::ffff:192.168.2.87:59138   ::ffff:192.168.2.88:7000    ESTABLISHED
>> >> tcp        0      0 ::ffff:192.168.2.87:7000    ::ffff:192.168.2.88:39074   ESTABLISHED
>> >>
>> >> So the connections that end up in CLOSE_WAIT must be newly created ones
>> >> (for streaming?). This seems related to the streaming issues we ran into
>> >> recently:
>> >> http://n2.nabble.com/busy-thread-on-IncomingStreamReader-td4908640.html
>> >>
>> >> I would like to add some debug code around the opening and closing of
>> >> sockets to find out what happened.
>> >>
>> >> Could you give me a hint about which classes I should look at?
>> >>
>> >>
>> >> On Tue, Apr 20, 2010 at 04:47, Jonathan Ellis <jbellis@gmail.com> wrote:
>> >>>
>> >>> Is this after doing a bootstrap or other streaming operation?  Or did
>> >>> a node go down?
>> >>>
>> >>> The internal sockets are supposed to remain open, otherwise.
>> >>>
>> >>> On Mon, Apr 19, 2010 at 10:56 AM, Ingram Chen <ingramchen@gmail.com> wrote:
>> >>> > Thanks for the information.
>> >>> >
>> >>> > We do use connection pools with the Thrift client, and ThriftAddress
>> >>> > is on port 9160.
>> >>> >
>> >>> > The problematic connections we found are all on port 7000, which is
>> >>> > the internal communication port between nodes. I guess this is
>> >>> > related to StreamingService.
>> >>> >
>> >>> > On Mon, Apr 19, 2010 at 23:46, Brandon Williams <driftx@gmail.com> wrote:
>> >>> >>
>> >>> >> On Mon, Apr 19, 2010 at 10:27 AM, Ingram Chen <ingramchen@gmail.com> wrote:
>> >>> >>>
>> >>> >>> Hi all,
>> >>> >>>
>> >>> >>>     We have observed several connections between nodes in
>> >>> >>> CLOSE_WAIT after several hours of operation:
>> >>> >>
>> >>> >> This is symptomatic of not pooling your client connections
>> >>> >> correctly. Be sure you're using one connection per thread, not one
>> >>> >> connection per operation.
>> >>> >> -Brandon
>> >>> >
>> >>> >
>> >>> > --
>> >>> > Ingram Chen
>> >>> > online share order: http://dinbendon.net
>> >>> > blog: http://www.javaworld.com.tw/roller/page/ingramchen
>> >>> >
>
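
The try/finally change proposed in the thread can be sketched in isolation.
This is a minimal illustration under stated assumptions, not Cassandra code:
ReadLoopDemo, readUntilEof and demo are hypothetical names, and the loop only
reads ints where the real IncomingTcpConnection parses message headers. It
assumes a loopback connection whose peer closes first, which is the situation
that otherwise leaves the accepting side in CLOSE_WAIT.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical stand-in for the read loop quoted in the thread.
class ReadLoopDemo
{
    // Reads ints until the peer closes its end, then always closes the
    // accepted socket. Returns the number of ints that were read.
    static int readUntilEof(Socket socket) throws IOException
    {
        int count = 0;
        try
        {
            DataInputStream input = new DataInputStream(socket.getInputStream());
            while (true)
            {
                try
                {
                    input.readInt();
                    count++;
                }
                catch (EOFException e)
                {
                    // The peer closed its side; leave the loop.
                    break;
                }
            }
        }
        finally
        {
            // Runs on every exit path, so the socket does not linger in CLOSE_WAIT.
            socket.close();
        }
        return count;
    }

    // Opens a loopback connection, sends two ints, closes the client side,
    // and reports whether the accepted socket was closed after EOF.
    static boolean demo()
    {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress()))
        {
            Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
            Socket accepted = server.accept();
            DataOutputStream out = new DataOutputStream(client.getOutputStream());
            out.writeInt(1);
            out.writeInt(2);
            out.flush();
            client.close();
            int read = readUntilEof(accepted);
            return read == 2 && accepted.isClosed();
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args)
    {
        System.out.println("accepted socket closed after EOF: " + demo());
    }
}
```

Closing in finally releases the socket on every exit path, but as noted at the
top of the thread it does not explain why the peer closed the connection, so
logging the cause before the break is still worthwhile.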
