Thanks Aaron, I’ve added this to the ticket. We were not running with TRACE logging enabled.

 

From: Aaron Morton [mailto:aaron@thelastpickle.com]
Sent: 30 September 2013 08:37
To: user@cassandra.apache.org
Subject: Re: 2.0.1 counter replicate on write error

 

ERROR [ReplicateOnWriteStage:19] 2013-09-27 10:17:14,778 CassandraDaemon.java (line 185) Exception in thread Thread[ReplicateOnWriteStage:19,5,main]
java.lang.AssertionError: DecoratedKey(-1754949563326053382, a414b0c07f0547f8a75410555716ced6) != DecoratedKey(-1754949563326053382, aeadcec8184445d4ab631ef4250927d0) in /disk3/cassandra/data/struqrealtime/counters/struqrealtime-counters-jb-831953-Data.db
        at org.apache.cassandra.db.columniterator.SSTableNamesIterator.read(SSTableNamesIterator.java:114)
        at org.apache.cassandra.db.columniterator.SSTableNamesIterator.<init>(SSTableNamesIterator.java:62)

 

When reading from an SSTable, the position returned from the -Index.db component or the key cache pointed to a row in the -Data.db component that belonged to a different key.

 

We were searching for:

DecoratedKey(-1754949563326053382, aeadcec8184445d4ab631ef4250927d0)

but this is what was found in the data component:

DecoratedKey(-1754949563326053382, a414b0c07f0547f8a75410555716ced6)

 

The first part is the token (Murmur3 hash); the second is the key. It looks like a hash collision, but it could also be a bug somewhere else.
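
To make the token-versus-key comparison concrete, here is a minimal sketch that parses the two DecoratedKey values out of the assertion message and checks which part differs. The helper name compare_decorated_keys and the regular expression are hypothetical, written only for this illustration; they are not part of Cassandra. The sketch does not recompute the Murmur3 token, so it cannot tell a genuine collision from a bug elsewhere.

```python
import re

# Hypothetical helper, not part of Cassandra: extracts (token, key) pairs
# from the text of the AssertionError quoted in this thread.
DECORATED_KEY = re.compile(r"DecoratedKey\((-?\d+), ([0-9a-f]+)\)")


def compare_decorated_keys(message):
    """Return (same_token, same_key) for the first two DecoratedKeys found."""
    matches = DECORATED_KEY.findall(message)
    if len(matches) < 2:
        raise ValueError("expected two DecoratedKey(...) values in the message")
    (token_a, key_a), (token_b, key_b) = matches[0], matches[1]
    return int(token_a) == int(token_b), key_a == key_b


# The assertion message from the log, with the file path omitted.
message = (
    "java.lang.AssertionError: "
    "DecoratedKey(-1754949563326053382, a414b0c07f0547f8a75410555716ced6) != "
    "DecoratedKey(-1754949563326053382, aeadcec8184445d4ab631ef4250927d0)"
)

same_token, same_key = compare_decorated_keys(message)
print("same token:", same_token)
print("same key:", same_key)
```

For the message above, the tokens match and the keys do not, which is why it looks like a collision from the log line alone.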

 

Code in SSTableReader.getPosition() points to https://issues.apache.org/jira/browse/CASSANDRA-4687 and adds an assertion that is only triggered if TRACE logging is enabled. Can you add this to the 4687 ticket and update the thread?

 

Cheers

 

-----------------

Aaron Morton

New Zealand

@aaronmorton

 

Co-Founder & Principal Consultant

Apache Cassandra Consulting

http://www.thelastpickle.com

 

On 27/09/2013, at 10:50 PM, Christopher Wirt <chris.wirt@struq.com> wrote:



Hello,

 

I’ve started to see a slightly worrying error appear in our logs occasionally. We’re writing at 400 qps per machine and I only see this appear every 5-10 minutes.

 

It seems to have started when I switched us to the hsha thrift server this morning. We had been running 2.0.1 on the sync thrift server since yesterday without seeing this error. It might not be related, though.

 

There are some machines in another DC still running 1.2.10.

 

Anyone seen this before? Have any insight?

 

ERROR [ReplicateOnWriteStage:19] 2013-09-27 10:17:14,778 CassandraDaemon.java (line 185) Exception in thread Thread[ReplicateOnWriteStage:19,5,main]
java.lang.AssertionError: DecoratedKey(-1754949563326053382, a414b0c07f0547f8a75410555716ced6) != DecoratedKey(-1754949563326053382, aeadcec8184445d4ab631ef4250927d0) in /disk3/cassandra/data/struqrealtime/counters/struqrealtime-counters-jb-831953-Data.db
        at org.apache.cassandra.db.columniterator.SSTableNamesIterator.read(SSTableNamesIterator.java:114)
        at org.apache.cassandra.db.columniterator.SSTableNamesIterator.<init>(SSTableNamesIterator.java:62)
        at org.apache.cassandra.db.filter.NamesQueryFilter.getSSTableColumnIterator(NamesQueryFilter.java:87)
        at org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:62)
        at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:249)
        at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:53)
        at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1468)
        at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1294)
        at org.apache.cassandra.db.Keyspace.getRow(Keyspace.java:332)
        at org.apache.cassandra.db.SliceByNamesReadCommand.getRow(SliceByNamesReadCommand.java:55)
        at org.apache.cassandra.db.CounterMutation.makeReplicationMutation(CounterMutation.java:100)
        at org.apache.cassandra.service.StorageProxy$8$1.runMayThrow(StorageProxy.java:1107)
        at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1897)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:724)