cassandra-commits mailing list archives

From "Jacek Furmankiewicz (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-4687) Exception: DecoratedKey(xxx, yyy) != DecoratedKey(zzz, kkk)
Date Fri, 06 Dec 2013 11:55:55 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-4687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841205#comment-13841205 ]

Jacek Furmankiewicz edited comment on CASSANDRA-4687 at 12/6/13 11:55 AM:
--------------------------------------------------------------------------

We are seeing the same issue on 1.1.12 and RHEL 5. A major batch process that involves lots of
reads and writes runs for a few hours, and in the middle of it Cassandra nodes start dying
one by one IN THE ENTIRE CLUSTER (3 nodes).

Sooner or later, the entire cluster is dead in the water. 
We can reproduce this every time.

This is blocking the go-live date for a major system.

BTW, this BUG IS NOT MINOR!  This is a SHOWSTOPPER issue.
The entire cluster, ALL NODES, goes down.

Distributed DB reliability is not supposed to allow something like this to happen.
The only workaround is restarting every single node one by one...in what is supposed to be a 24x7
system that has no right to go down.

Please escalate this to SHOWSTOPPER status.


> Exception: DecoratedKey(xxx, yyy) != DecoratedKey(zzz, kkk)
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-4687
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4687
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: CentOS 6.3 64-bit, Oracle JRE 1.6.0.33 64-bit, single node cluster
>            Reporter: Leonid Shalupov
>            Priority: Minor
>         Attachments: 4687-debugging.txt
>
>
> Under heavy write load, Cassandra sometimes fails with an assertion error.
> git bisect points to commit 295aedb278e7a495213241b66bc46d763fd4ce66.
> It works fine if the global key/row caches are disabled in code.
> {quote}
> java.lang.AssertionError: DecoratedKey(xxx, yyy) != DecoratedKey(zzz, kkk) in /var/lib/cassandra/data/...-he-1-Data.db
> 	at org.apache.cassandra.db.columniterator.SSTableSliceIterator.<init>(SSTableSliceIterator.java:60)
> 	at org.apache.cassandra.db.filter.SliceQueryFilter.getSSTableColumnIterator(SliceQueryFilter.java:67)
> 	at org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:79)
> 	at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:256)
> 	at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:64)
> 	at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1345)
> 	at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1207)
> 	at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1142)
> 	at org.apache.cassandra.db.Table.getRow(Table.java:378)
> 	at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:69)
> 	at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:819)
> 	at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1253)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> 	at java.lang.Thread.run(Thread.java:662)
> {quote}
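
For readers unfamiliar with this kind of failure, here is a minimal sketch of the general pattern the assertion message suggests: a reader looks up a row position, reads the key stored at that position, and fails if it is not the key that was requested. This is not Cassandra's code. The class KeyCheckSketch, the method readKeyAt, and the two maps standing in for the data file and the key cache are all hypothetical, and the idea that a stale cached position is the trigger is only an assumption based on the reporter's note that the problem disappears when the key/row caches are disabled.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in Cassandra.
class KeyCheckSketch {

    /**
     * Looks up the cached position for requestedKey, reads the key stored at
     * that position, and verifies that the two match.
     *
     * dataFile: stand-in for an on-disk data file (position -> key stored there)
     * keyCache: stand-in for a key cache (key -> position in the data file)
     */
    static String readKeyAt(Map<Long, String> dataFile,
                            Map<String, Long> keyCache,
                            String requestedKey) {
        Long position = keyCache.get(requestedKey);
        if (position == null) {
            throw new IllegalStateException("no cached position for " + requestedKey);
        }
        String keyOnDisk = dataFile.get(position);
        // An explicit throw is used instead of the assert keyword so the check
        // also fires when the JVM runs without -ea.
        if (!requestedKey.equals(keyOnDisk)) {
            throw new AssertionError(keyOnDisk + " != " + requestedKey);
        }
        return keyOnDisk;
    }

    public static void main(String[] args) {
        Map<Long, String> dataFile = new HashMap<>();
        dataFile.put(0L, "xxx");
        dataFile.put(100L, "zzz");

        Map<String, Long> keyCache = new HashMap<>();
        keyCache.put("xxx", 0L);
        System.out.println("read " + readKeyAt(dataFile, keyCache, "xxx"));

        // Assumed failure mode: the cached position now points at another row.
        keyCache.put("xxx", 100L);
        try {
            readKeyAt(dataFile, keyCache, "xxx");
        } catch (AssertionError e) {
            System.out.println("AssertionError: " + e.getMessage());
        }
    }
}
```

In this toy model the second read fails because the cached position no longer points at the requested key. Whether that is what happens in the reported bug is not established anywhere in this message.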



--
This message was sent by Atlassian JIRA
(v6.1#6144)
