incubator-cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Yan Chunlu <springri...@gmail.com>
Subject Re: cassandra halt after started minutes later
Date Sun, 01 Jul 2012 15:33:40 GMT
adjust the timezone of java by  -Duser.timezone   and the timezone of
cassandra is the same with system(Debian 6.0).

after restart cassandra I found the following error message in the log file
of node B. after about 2 minutes later, node C stop responding....

the error log of node B:

Thrift transport error occurred during processing of message.
org.apache.thrift.transport.TTransportException
at
org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
 at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
at
org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129)
 at
org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
 at
org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
at
org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
 at
org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
at
org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2877)
 at
org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)



the log info in node C:


DEBUG [MutationStage:25] 2012-07-01 23:29:42,909
RowMutationVerbHandler.java (line 60) RowMutation(keyspace='spark',
key='39373438366235383638373631353532643133393334633435326333323634373131656462306139',
modifications=[ColumnFamily(permacache
[76616c7565:false:67906@1341156582948365,])]) applied.  Sending response to
79529@/192.168.1.129
DEBUG [pool-2-thread-209] 2012-07-01 23:29:42,913 CassandraServer.java
(line 523) insert
DEBUG [pool-2-thread-209] 2012-07-01 23:29:42,913 StorageProxy.java (line
172) Mutations/ConsistencyLevel are [RowMutation(keyspace='spark',
key='636f6d6d656e74735f706172656e74735f32373232343938',
modifications=[ColumnFamily(permacache [76616c7565:false:6@1341156582953843
,])])]/QUORUM
DEBUG [pool-2-thread-209] 2012-07-01 23:29:42,913 StorageProxy.java (line
301) insert writing key 636f6d6d656e74735f706172656e74735f32373232343938 to
/192.168.1.40
DEBUG [pool-2-thread-209] 2012-07-01 23:29:42,913 StorageProxy.java (line
301) insert writing key 636f6d6d656e74735f706172656e74735f32373232343938 to
/192.168.1.129
DEBUG [Thread-8] 2012-07-01 23:29:42,913 IncomingTcpConnection.java (line
116) Version is now 3
DEBUG [RequestResponseStage:27] 2012-07-01 23:29:42,913
ResponseVerbHandler.java (line 44) Processing response on a callback from
50050@/192.168.1.129
DEBUG [Thread-12] 2012-07-01 23:29:42,914 IncomingTcpConnection.java (line
116) Version is now 3
DEBUG [RequestResponseStage:29] 2012-07-01 23:29:42,914
ResponseVerbHandler.java (line 44) Processing response on a callback from
50051@/192.168.1.40
DEBUG [Thread-11] 2012-07-01 23:29:42,939 IncomingTcpConnection.java (line
116) Version is now 3



On Sun, Jul 1, 2012 at 11:14 PM, Yan Chunlu <springrider@gmail.com> wrote:

> I have a three node cluster running 1.0.2, today there's a very strange
> problem that suddenly two of cassandra  node(let's say B and C) was costing
> a lot of cpu, turned out for some reason the "java" binary just dont
> run.... I am using OpenJDK1.6.0_18, so I switched to "sun jdk", which works
> okay.
>
> after that node A stop working... same problem, I install "sun jdk", then
> it's okay. but minutes later, B stop working again, about 5-10 minutes
> later after the cassandra started, it stop responding connections, I can't
> access 9160 and nodetool dont return either.
>
> I have turned on DEBUG and dont see much useful information, the last rows
> on node B are as belows:
> DEBUG [pool-2-thread-72] 2012-07-01 07:45:42,830 RowDigestResolver.java
> (line 65) resolving 2 responses
> DEBUG [pool-2-thread-72] 2012-07-01 07:45:42,830 RowDigestResolver.java
> (line 106) digests verified
> DEBUG [pool-2-thread-72] 2012-07-01 07:45:42,830 RowDigestResolver.java
> (line 110) resolve: 0 ms.
> DEBUG [pool-2-thread-72] 2012-07-01 07:45:42,831 StorageProxy.java (line
> 694) Read: 5 ms.
> DEBUG [Thread-8] 2012-07-01 07:45:42,831 IncomingTcpConnection.java (line
> 116) Version is now 3
> DEBUG [Thread-8] 2012-07-01 07:45:42,831 IncomingTcpConnection.java (line
> 116) Version is now 3
>
>
> this problem is really driving me crazy since I just dont know what
> happened, and how to debug it, I tried to kill node A and restart it, then
> node B halt, after I restart B, then node C goes down......
>
>
> one thing may related is that the log time on node B is not the same with
> the system time(A and C are okay).
>
> while date on node B shows:
> Sun Jul  1 23:10:57 CST 2012 (system time)
>
> but you may noticed that the time is "2012-07-01 07:45:XX" in those above
> log message.  the system time is right, just not sure why cassandra's log
> file shows the wrong time, I didn't recall cassandra have timezone
> settings.....
>
>
>
>
>

Mime
View raw message