incubator-cassandra-user mailing list archives

From Yang <teddyyyy...@gmail.com>
Subject Re: frequent node UP/Down?
Date Wed, 28 Sep 2011 02:22:46 GMT
Found the reason.

IncomingTcpConnection.run() hit an exception and the thread
terminated. The next incarnation of the thread did not come up until
20 seconds later, which caused the TimedOutException and
UnavailableException seen by clients.



 WARN [Thread-28] 2011-09-28 02:17:57,561 IncomingTcpConnection.java (line 122) eof reading from socket; closing
java.io.EOFException
        at java.io.DataInputStream.readInt(DataInputStream.java:392)
        at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:112)
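
For reference, the EOFException at the top of that trace is what
DataInputStream.readInt() throws when the stream ends before all four
bytes of the int have been read, for example when the peer closes the
connection. The following is a minimal standalone sketch of that
behaviour, not Cassandra code; the names EofDemo and hitsEof are made up
for illustration, and an in-memory byte array stands in for the socket.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;

// Illustration only, not Cassandra code. EofDemo and hitsEof are hypothetical names.
class EofDemo {
    // Returns true if readInt() runs out of input before it has read four bytes.
    static boolean hitsEof(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            in.readInt();
            return false;
        } catch (EOFException e) {
            // Same exception type as in the stack trace above.
            return true;
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrown unchecked.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(hitsEof(new byte[0]));             // stream already at its end
        System.out.println(hitsEof(new byte[] {0, 0}));       // stream ends mid-int
        System.out.println(hitsEof(new byte[] {0, 0, 0, 7})); // a full int is available
    }
}
```

The sketch only shows where the exception type comes from. It cannot tell
whether the EOF on the real connection was caused by the network or by
the remote node closing the socket.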



I don't know whether the EOF here is really due to the network or to
something in the code. (If it really is the network, is there a way to
make IncomingTcpConnection start the next thread faster, say within
1 second? I'm reading through the code to find out.)

Thanks
Yang



On Sun, Sep 25, 2011 at 1:04 PM, Brandon Williams <driftx@gmail.com> wrote:
> On Sun, Sep 25, 2011 at 1:10 PM, Yang <teddyyyy123@gmail.com> wrote:
>> Thanks Brandon.
>>
>> I'll try this.
>>
>> but you can also see my later post regarding message drop :
>> http://mail-archives.apache.org/mod_mbox/cassandra-user/201109.mbox/%3CCAAnh3_8AeHidYH9ybt82_EMH3LikbCDseNRak3JHfzaJ2L+9zQ@mail.gmail.com%3E
>>
>> that seems to show something in either code or background load causing
>> messages to be really dropped
>
> I see.  My guess is then this: there is a local clock problem, causing
> generations to be the same, thus not notifying the FD.  So perhaps the
> problem is not network-related, but it is something in the ec2
> environment.
>
> -Brandon
>
