Thanks. 

The only workaround I can think of is using nodetool scrub. That will read the -Data.db file and rewrite it along with the other components. 

Remember to snapshot first so you can roll back. 
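
As a rough sketch of that order of operations (snapshot, then scrub), the Python below builds the two nodetool invocations and prints them by default instead of running them. The keyspace and column family names are taken from the SSTable path in the error; the snapshot tag "pre-scrub" and the helper names are made up for illustration, and it assumes nodetool is on the PATH of the affected node.

```python
import subprocess


def scrub_plan(keyspace, table, tag):
    """Return the nodetool commands in order: snapshot first, then scrub."""
    return [
        # Snapshot the keyspace so the original SSTables can be restored.
        ["nodetool", "snapshot", "-t", tag, keyspace],
        # Scrub rewrites the SSTables of the given column family.
        ["nodetool", "scrub", keyspace, table],
    ]


def run_plan(plan, dry_run=True):
    """Print each command; only execute when dry_run is False."""
    for cmd in plan:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops before the scrub if the snapshot fails.
            subprocess.run(cmd, check=True)


# "pre-scrub" is a hypothetical tag name, not something from the thread.
plan = scrub_plan("struqrealtime", "counters", "pre-scrub")
run_plan(plan)
```

Running with dry_run=False on the node that logged the error would execute both steps; the default only shows what would be run.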


Cheers

-----------------
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 30/09/2013, at 10:43 PM, Christopher Wirt <chris.wirt@struq.com> wrote:

Thanks Aaron, I've added to the ticket. We were not running on TRACE logging.
 
From: Aaron Morton [mailto:aaron@thelastpickle.com] 
Sent: 30 September 2013 08:37
To: user@cassandra.apache.org
Subject: Re: 2.0.1 counter replicate on write error
 
ERROR [ReplicateOnWriteStage:19] 2013-09-27 10:17:14,778 CassandraDaemon.java (line 185) Exception in thread Thread[ReplicateOnWriteStage:19,5,main]
java.lang.AssertionError: DecoratedKey(-1754949563326053382, a414b0c07f0547f8a75410555716ced6) != DecoratedKey(-1754949563326053382, aeadcec8184445d4ab631ef4250927d0) in /disk3/cassandra/data/struqrealtime/counters/struqrealtime-counters-jb-831953-Data.db
        at org.apache.cassandra.db.columniterator.SSTableNamesIterator.read(SSTableNamesIterator.java:114)
        at org.apache.cassandra.db.columniterator.SSTableNamesIterator.<init>(SSTableNamesIterator.java:62)
 
When reading from an SSTable, the position returned from the -Index.db component / key cache pointed to a location in the -Data.db component that held a different row. 
 
DecoratedKey(-1754949563326053382, aeadcec8184445d4ab631ef4250927d0)
Was what we were searching for
 
DecoratedKey(-1754949563326053382, a414b0c07f0547f8a75410555716ced6)
Is what was found in the data component. 
 
The first part is the token (the Murmur3 hash); the second is the key. It looks like a hash collision, but it could also be a bug somewhere else. 
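
To make the token/key comparison concrete, here is a small Python sketch that pulls both DecoratedKey values out of the AssertionError line and compares the two parts separately. The regular expression and the function name are illustrative only and are not part of Cassandra; the log line is the one from the report.

```python
import re

# Matches "DecoratedKey(<token>, <hex key>)" as printed in the error message.
DECORATED_KEY = re.compile(r"DecoratedKey\((-?\d+), ([0-9a-f]+)\)")


def parse_decorated_keys(line):
    """Return a list of (token, key_hex) tuples found in a log line."""
    return [(int(token), key) for token, key in DECORATED_KEY.findall(line)]


line = (
    "java.lang.AssertionError: "
    "DecoratedKey(-1754949563326053382, a414b0c07f0547f8a75410555716ced6) != "
    "DecoratedKey(-1754949563326053382, aeadcec8184445d4ab631ef4250927d0) in "
    "/disk3/cassandra/data/struqrealtime/counters/"
    "struqrealtime-counters-jb-831953-Data.db"
)

# First key is what was found in the data file, second is what was requested.
found, wanted = parse_decorated_keys(line)
same_token = found[0] == wanted[0]
same_key = found[1] == wanted[1]

print("same token:", same_token)
print("same key:", same_key)
```

Identical tokens with different key bytes is the pattern described above: two distinct row keys that land on the same token.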
 
Code in SSTableReader.getPosition() points to https://issues.apache.org/jira/browse/CASSANDRA-4687 and adds an assertion that is only triggered if TRACE logging is enabled. Can you add this to the 4687 ticket and update the thread? 
 
Cheers
 
-----------------
Aaron Morton
New Zealand
@aaronmorton
 
Co-Founder & Principal Consultant
Apache Cassandra Consulting
 
On 27/09/2013, at 10:50 PM, Christopher Wirt <chris.wirt@struq.com> wrote:


Hello,
 
I've started to see a slightly worrying error appear in our logs occasionally. We're writing at 400 qps per machine and I only see this appear every 5-10 minutes.
 
It seems to have started when I switched us to the hsha thrift server this morning. We had been running 2.0.1 on the sync thrift server since yesterday without seeing this error, so it might not be related.
 
There are some machines in another DC still running 1.2.10.
 
Anyone seen this before? Have any insight?
 
ERROR [ReplicateOnWriteStage:19] 2013-09-27 10:17:14,778 CassandraDaemon.java (line 185) Exception in thread Thread[ReplicateOnWriteStage:19,5,main]
java.lang.AssertionError: DecoratedKey(-1754949563326053382, a414b0c07f0547f8a75410555716ced6) != DecoratedKey(-1754949563326053382, aeadcec8184445d4ab631ef4250927d0) in /disk3/cassandra/data/struqrealtime/counters/struqrealtime-counters-jb-831953-Data.db
        at org.apache.cassandra.db.columniterator.SSTableNamesIterator.read(SSTableNamesIterator.java:114)
        at org.apache.cassandra.db.columniterator.SSTableNamesIterator.<init>(SSTableNamesIterator.java:62)
        at org.apache.cassandra.db.filter.NamesQueryFilter.getSSTableColumnIterator(NamesQueryFilter.java:87)
        at org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:62)
        at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:249)
        at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:53)
        at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1468)
        at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1294)
        at org.apache.cassandra.db.Keyspace.getRow(Keyspace.java:332)
        at org.apache.cassandra.db.SliceByNamesReadCommand.getRow(SliceByNamesReadCommand.java:55)
        at org.apache.cassandra.db.CounterMutation.makeReplicationMutation(CounterMutation.java:100)
        at org.apache.cassandra.service.StorageProxy$8$1.runMayThrow(StorageProxy.java:1107)
        at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1897)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:724)