incubator-cassandra-user mailing list archives

From Anthony Molinaro
Subject Data Corruption Problem with cassandra 0.4.0
Date Wed, 30 Sep 2009 18:24:15 GMT

I'm not getting any responses on IRC, so figured I'd put this out on
the mailing list.

I had a 3-node Cassandra cluster with replication factor 3, running on
3 EC2 m1.large instances behind an haproxy.  I restarted one
of the nodes to test out some modified sysctls (TCP stack tuning).
As soon as I restarted it, the other 2 nodes started spiking in memory
use and the restarted node seemed to have corrupted data.  The corruption
shows up as an exception when I read some, but only some, keys.
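To narrow down which keys are affected, something like the following could
be used. This is only a rough sketch: `find_unreadable_keys` and `read_key`
are hypothetical names, and `read_key` stands in for whatever client call
performs the read against the node (it is not a real Cassandra API). The
reader in the demo is a fake that fails on a fixed set of keys.

```python
from typing import Callable, Iterable, List, Tuple


def find_unreadable_keys(
    keys: Iterable[str],
    read_key: Callable[[str], object],
) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Try to read every key and split them into readable and failing.

    `read_key` is a hypothetical callable wrapping the real client read;
    it is expected to raise an exception when the server reports an error.
    """
    readable: List[str] = []
    failures: List[Tuple[str, str]] = []
    for key in keys:
        try:
            read_key(key)
        except Exception as exc:
            # Record the error and keep going so one bad key does not
            # hide the others.
            failures.append((key, f"{type(exc).__name__}: {exc}"))
        else:
            readable.append(key)
    return readable, failures


if __name__ == "__main__":
    # Stand-in reader for demonstration only: fails on a fixed set of keys.
    broken = {"k2", "k4"}

    def fake_read(key: str) -> str:
        if key in broken:
            raise RuntimeError("simulated read failure")
        return "ok"

    good, bad = find_unreadable_keys(["k1", "k2", "k3", "k4"], fake_read)
    print("readable:", good)
    print("failing:", bad)
```

Running the same key list against each node separately would show whether
the failures are confined to the restarted node.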

The exception is

ERROR [pool-1-thread-1] 2009-09-30 17:50:30,037 (line 679) Internal error processing
        at org.apache.cassandra.service.CassandraServer.readColumnFamily(
        at org.apache.cassandra.service.CassandraServer.getSlice(
        at org.apache.cassandra.service.CassandraServer.multigetSliceInternal(
        at org.apache.cassandra.service.CassandraServer.get_slice(
        at org.apache.cassandra.service.Cassandra$Processor$get_slice.process(
        at org.apache.cassandra.service.Cassandra$Processor.process( 
        at org.apache.thrift.server.TThreadPoolServer$
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(
        at java.util.concurrent.ThreadPoolExecutor$
Caused by:
        at org.apache.cassandra.db.filter.SSTableSliceIterator$ColumnGroupReader.<init>(
        at org.apache.cassandra.db.filter.SSTableSliceIterator.<init>(
        at org.apache.cassandra.db.filter.SliceQueryFilter.getSSTableColumnIterator(
        at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(
        at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(
        at org.apache.cassandra.db.Table.getRow(
        at org.apache.cassandra.db.SliceFromReadCommand.getRow(
        at org.apache.cassandra.service.StorageProxy.weakReadLocal(
        at org.apache.cassandra.service.StorageProxy.readProtocol(
        at org.apache.cassandra.service.CassandraServer.readColumnFamily(
        ... 9 more

I ended up having to fire up some new instances and reload the data.
Luckily this is my small instance, which I can reload quickly.  I also have
a large Cassandra cluster currently loading, which I will not be able to
reload this way, so I'm a little scared about that cluster.

Anyway, any ideas?  I've left the broken cluster so I can investigate/patch/etc.


Anthony Molinaro
