cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Rick Branson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-6285) 2.0 HSHA server introduces corrupt data
Date Thu, 06 Mar 2014 23:22:43 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-6285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13923245#comment-13923245
] 

Rick Branson commented on CASSANDRA-6285:
-----------------------------------------

Unfortunately this hit us during our 2.0.5 upgrade and 'sync' is not an option for the # of
connections we have per node (tried this). We've been running Marcus' patch in prod and limping
along on it, but it looks like the requestInvoke() override is causing the requests to get
executed on the selector pool (which is limited by CPU #) instead of the executor service
so our response times are pretty bad. The lack of anything showing up in the JMX for the executor
service definitely points towards this. 

> 2.0 HSHA server introduces corrupt data
> ---------------------------------------
>
>                 Key: CASSANDRA-6285
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6285
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: 4 nodes, shortly updated from 1.2.11 to 2.0.2
>            Reporter: David Sauer
>            Assignee: Pavel Yaskevich
>            Priority: Critical
>             Fix For: 2.0.6
>
>         Attachments: 6285_testnotes1.txt, CASSANDRA-6285-disruptor-heap.patch, compaction_test.py
>
>
> After altering everything to LCS the table OpsCenter.rollups60 amd one other none OpsCenter-Table
got stuck with everything hanging around in L0.
> The compaction started and ran until the logs showed this:
> ERROR [CompactionExecutor:111] 2013-11-01 19:14:53,865 CassandraDaemon.java (line 187)
Exception in thread Thread[CompactionExecutor:111,1,RMI Runtime]
> java.lang.RuntimeException: Last written key DecoratedKey(1326283851463420237, 37382e34362e3132382e3139382d6a7576616c69735f6e6f72785f696e6465785f323031335f31305f30382d63616368655f646f63756d656e74736c6f6f6b75702d676574426c6f6f6d46696c746572537061636555736564)
>= current key DecoratedKey(954210699457429663, 37382e34362e3132382e3139382d6a7576616c69735f6e6f72785f696e6465785f323031335f31305f30382d63616368655f646f63756d656e74736c6f6f6b75702d676574546f74616c4469736b5370616365557365640b0f)
writing into /var/lib/cassandra/data/OpsCenter/rollups60/OpsCenter-rollups60-tmp-jb-58656-Data.db
> 	at org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:141)
> 	at org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:164)
> 	at org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:160)
> 	at org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
> 	at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
> 	at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
> 	at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
> 	at org.apache.cassandra.db.compaction.CompactionManager$6.runMayThrow(CompactionManager.java:296)
> 	at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> 	at java.lang.Thread.run(Thread.java:724)
> Moving back to STC worked to keep the compactions running.
> Especialy my own Table i would like to move to LCS.
> After a major compaction with STC the move to LCS fails with the same Exception.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Mime
View raw message