incubator-cassandra-user mailing list archives

From Teodor Sigaev <teo...@sigaev.ru>
Subject Re: Cassandra restart
Date Thu, 15 Oct 2009 11:39:55 GMT
> I will try to reproduce problem on smaller test cluster.
It was rather easy to reproduce on a cluster of 4 servers.
Log fragment from the restarted node (10.3.2.38):

DEBUG [pool-1-thread-64] 2009-10-15 14:18:16,290 CassandraServer.java (line 214) get_slice
DEBUG [pool-1-thread-64] 2009-10-15 14:18:16,290 StorageProxy.java (line 239) weakreadlocal reading SliceFromReadCommand(table='Keyspace1', key='0000000000000000000000000000000000849706', column_parent='QueryPath(columnFamilyName='Super1', superColumnName='[B@6ca50fbe', columnName='null')', start='1', finish='0', reversed=true, count=2)
DEBUG [pool-1-thread-64] 2009-10-15 14:18:16,290 StorageProxy.java (line 251) weakreadremote reading SliceFromReadCommand(table='Keyspace1', key='0000000000000000000000000000000000849706', column_parent='QueryPath(columnFamilyName='Super1', superColumnName='[B@6ca50fbe', columnName='null')', start='1', finish='0', reversed=true, count=2) from 207911@10.3.2.40:7000
...
ERROR [pool-1-thread-64] 2009-10-15 14:18:21,281 Cassandra.java (line 679) Internal error processing get_slice
java.lang.RuntimeException: error reading key 0000000000000000000000000000000000849706
     at org.apache.cassandra.service.StorageProxy.weakReadRemote(StorageProxy.java:265)
     at org.apache.cassandra.service.StorageProxy.readProtocol(StorageProxy.java:312)
     at org.apache.cassandra.service.CassandraServer.readColumnFamily(CassandraServer.java:95)
     at org.apache.cassandra.service.CassandraServer.getSlice(CassandraServer.java:177)
     at org.apache.cassandra.service.CassandraServer.multigetSliceInternal(CassandraServer.java:252)
     at org.apache.cassandra.service.CassandraServer.get_slice(CassandraServer.java:215)
     at org.apache.cassandra.service.Cassandra$Processor$get_slice.process(Cassandra.java:671)
     at org.apache.cassandra.service.Cassandra$Processor.process(Cassandra.java:627)
     at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:253)
     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
     at java.lang.Thread.run(Thread.java:636)
Caused by: java.util.concurrent.TimeoutException: Operation timed out.
     at org.apache.cassandra.net.AsyncResult.get(AsyncResult.java:97)
     at org.apache.cassandra.service.StorageProxy.weakReadRemote(StorageProxy.java:261)
     ... 11 more

Log fragment from 10.3.2.40:
DEBUG [ROW-READ-STAGE:4] 2009-10-15 14:18:16,308 ReadVerbHandler.java (line 100) Read key 0000000000000000000000000000000000849706; sending response to 207911@10.3.2.38:7000
....
DEBUG [CONSISTENCY-MANAGER:2] 2009-10-15 14:18:16,308 ConsistencyManager.java (line 168) Reading consistency digest for 0000000000000000000000000000000000849706 from 527021@[10.3.2.39:7000, 10.3.2.41:7000]
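
The timestamps in the two fragments can be compared directly. The sketch below only does the arithmetic on the three timestamps quoted above; it assumes the clocks on 10.3.2.38 and 10.3.2.40 are roughly in sync, which the logs themselves do not prove. The variable names are made up for illustration.

```python
from datetime import datetime

# Timestamp layout used in the log lines above, e.g. "2009-10-15 14:18:16,290".
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"

def parse_log_time(text):
    """Parse a log timestamp; the three digits after the comma are milliseconds."""
    return datetime.strptime(text, LOG_TIME_FORMAT)

# Values copied from the fragments above.
sent = parse_log_time("2009-10-15 14:18:16,290")     # weakreadremote issued on .38
replied = parse_log_time("2009-10-15 14:18:16,308")  # .40 logs "sending response"
failed = parse_log_time("2009-10-15 14:18:21,281")   # .38 logs the TimeoutException

reply_delay = (replied - sent).total_seconds()
timeout_delay = (failed - sent).total_seconds()

print(reply_delay)    # 0.018
print(timeout_delay)  # 4.991
```

If the clocks agree, 10.3.2.40 logged its response about 18 ms after the request was issued, while 10.3.2.38 gave up about 5 seconds later.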

I have the full logs, but they are about half a gigabyte per node. If needed,
I can put them somewhere accessible over HTTP.
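
Since the full logs are large, a filter along these lines could cut them down to the entries for one key before sharing. This is a minimal sketch, not a tool that exists in Cassandra: the function name entries_for_key is hypothetical, and it assumes every log entry starts with a level followed by a bracketed thread name, as in the fragments above, with stack-trace lines belonging to the entry before them.

```python
import re

# Assumption: a new entry begins like "DEBUG [pool-1-thread-64] 2009-10-15 ...".
ENTRY_START = re.compile(r"^(DEBUG|INFO|WARN|ERROR|FATAL) \[")

def entries_for_key(lines, key):
    """Yield whole log entries (header plus continuation lines) that mention key.

    lines can be an open file, so the log is streamed rather than loaded
    into memory at once.
    """
    current = []
    for line in lines:
        # A header line closes the entry collected so far.
        if ENTRY_START.match(line) and current:
            entry = "".join(current)
            if key in entry:
                yield entry
            current = []
        current.append(line)
    # Flush the last entry in the file.
    if current:
        entry = "".join(current)
        if key in entry:
            yield entry
```

It could be used as `for entry in entries_for_key(open(path, errors="replace"), key): print(entry, end="")`, where path is the node's log file and key is the failing row key.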

How to reproduce:
- configure a 4-node cluster with these changes in storage-conf.xml:
   <ReplicationFactor>3</ReplicationFactor>
   <FlushMinThreads>8</FlushMinThreads>
   <FlushMaxThreads>16</FlushMaxThreads>
- edit the attached scripts to use the correct node IPs
- run  perl writecluster.pl -c 8  and wait 10-20 minutes
- run  perl readcluster.pl
- look at the error :)

-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/
