cassandra-commits mailing list archives

From "Jason Brown (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-7465) DecoratedKey assertion error on reads
Date Fri, 27 Jun 2014 23:31:25 GMT


Jason Brown commented on CASSANDRA-7465:

My current hypothesis is:

1) SSTNI.<ctor>(SSTableReader, DecoratedKey, SortedSet) is invoked with value "A" for
the DecoratedKey.
2) The first thing the ctor does is get the RowIndexEntry from SSTR.getPosition(), which looks
in the keyCache for the DK (more or less).
2a) If it's a cache miss, we put a new entry into the keyCache, which shares the same row
key ByteBuffer as the key passed into SSTNI's ctor.
2b) If it's a cache hit, we return the RIE, since the cached key's value matches the SSTNI's key
(but is a different BB).
3) The ctor then reads the key from disk (as pointed to by RIE.position). If the key read from
disk doesn't match the key passed into the ctor, the assertion fails.

Thus, I think the reason the assertion fails is that the SSTNI.key is being corrupted after it
is used for the lookup into the keyCache. I'll instrument the code to verify whether this is correct.
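
To make the hypothesis concrete, here is a minimal, self-contained sketch of the suspected failure mode. It is not Cassandra code: the Key class is a hypothetical stand-in for DecoratedKey, the token is just the buffer's hashCode, and the "corruption" is simulated by overwriting the backing array that the key's ByteBuffer shares. It only illustrates how a key that is valid at lookup time can fail the comparison against the on-disk key later.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class SharedKeyBufferSketch {

    /** Hypothetical stand-in for DecoratedKey: token computed once, key bytes held by reference. */
    static final class Key {
        final int token;
        final ByteBuffer bytes;

        Key(ByteBuffer bytes) {
            this.bytes = bytes;            // no defensive copy (assumption of this sketch)
            this.token = bytes.hashCode(); // snapshot of the contents at construction time
        }

        boolean sameAs(Key other) {
            return token == other.token && bytes.equals(other.bytes);
        }
    }

    /** Returns {match before mutation, match after mutation, tokens still equal}. */
    static boolean[] run() {
        byte[] shared = "A".getBytes(StandardCharsets.US_ASCII);
        Key requested = new Key(ByteBuffer.wrap(shared));

        // Steps 1-2: the index lookup happens while the key is intact, so it
        // resolves to the on-disk row whose key is "A".
        Key onDisk = new Key(ByteBuffer.wrap("A".getBytes(StandardCharsets.US_ASCII)));
        boolean before = requested.sameAs(onDisk);

        // Hypothesized corruption: something else overwrites the shared backing array.
        shared[0] = (byte) 'B';

        // Step 3: the comparison against the key read from disk now fails,
        // even though the token captured earlier is unchanged.
        boolean after = requested.sameAs(onDisk);
        return new boolean[] { before, after, requested.token == onDisk.token };
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println("match before mutation: " + r[0]);
        System.out.println("match after mutation:  " + r[1]);
        System.out.println("tokens still equal:    " + r[2]);
    }
}
```

In this sketch the two keys keep equal tokens while their bytes differ, which has the same shape as the assertion message quoted below (identical token, different key bytes). That is consistent with the hypothesis but does not prove it.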

> DecoratedKey assertion error on reads
> -------------------------------------
>                 Key: CASSANDRA-7465
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: 3 nodes
> Oracle Linux Server 6.3 
> kernel ver 2.6.39
>            Reporter: Jason Brown
>            Priority: Blocker
>             Fix For: 2.1.0
> Getting the following exception when running read stress:
> {code}WARN  [SharedPool-Worker-31] 2014-06-27 21:25:51,391
- Uncaught exception on thread Thread[SharedPool-Worker-31,10,main]: {}
> java.lang.AssertionError: DecoratedKey(-5397116645141815707, 30303031393143364639) !=
DecoratedKey(-5397116645141815707, 30303031343439443233) in /u/sdd/cassandra-jasobrown/data/Keyspace1/Standard1-6ab9bd90fe3b11e385edff96c2ef2fd6/Keyspace1-Standard1-ka-73-Data.db
> 	at
> 	at org.apache.cassandra.db.columniterator.SSTableNamesIterator.<init>(
> 	at org.apache.cassandra.db.filter.NamesQueryFilter.getSSTableColumnIterator(
> 	at org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(
> 	at org.apache.cassandra.db.CollationController.collectTimeOrderedData(
> 	at org.apache.cassandra.db.CollationController.getTopLevelColumns(
> 	at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(
> 	at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(
> 	at org.apache.cassandra.db.Keyspace.getRow( ~[apache-cassandra-2.1.0-rc2-SNAPSHOT.jar:2.1.0-rc2-SNAPSHOT]
> 	at org.apache.cassandra.db.SliceByNamesReadCommand.getRow(
> 	at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(
> 	at org.apache.cassandra.service.StorageProxy$
> 	at java.util.concurrent.Executors$ ~[na:1.7.0_13]
> 	at org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$
> 	at [apache-cassandra-2.1.0-rc2-SNAPSHOT.jar:2.1.0-rc2-SNAPSHOT]
> 	at [na:1.7.0_13]
> {code}
> I have a three node cluster, which I populate with the following stress command:
> {code}
> cassandra-stress write n=60000000 -schema replication\(factor\=2\) -key populate=1..60000000 -rate threads=42  -mode native prepared cql3 -port native=9043 thrift=9161  -node athena06-a,athena06-b,athena06-c -col n=fixed\(21\) size=exp\(11..42\)
> {code}
> Then I run the read stress:
> {code}
> cassandra-stress read n=100000000  -key dist=extr\(1..600000000,2\)  -mode native prepared cql3 -port native=9043 thrift=9161  -node athena06-b,athena06-c,athena06-a -col n=fixed\(21\) -rate threads=64
> {code}
> The above exception occurs semi-frequently (several to ~50 times a minute, but seems
> to depend on the amount of data in the cluster - anecdotal evidence only).

