cassandra-commits mailing list archives

From "Miles Shang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-6285) 2.0 HSHA server introduces corrupt data
Date Fri, 07 Mar 2014 20:54:59 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-6285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13924330#comment-13924330 ]

Miles Shang commented on CASSANDRA-6285:
----------------------------------------

To add to [~rbranson]'s input, we're also seeing the same stack trace as [~mshuler] (TimeUUID
MarshalException). I inspected the row mutations that caused it. Three byte ranges were nonsensical:
the key, the column name, and the value. By nonsensical, I mean that they don't match what we expect
to be inserting in production. All other ranges seemed fine (timestamps, masks, sizes, cfid). The key,
column name, and value were read successfully, so their length metadata was good. For our data, the
column comparator is TimeUUID and our client library is pycassa. Whereas pycassa generates TimeUUIDs
like 913d7fea-a631-11e3-8080-808080808080, the nonsensical column names look like
22050aa4-de11-e380-8080-80808080800b and 10c326eb-86a4-e211-e380-808080808080; most are of the first
form. By shifting these nonsensical TimeUUIDs to the left or right by one octet (two hex digits), you
get a reasonable TimeUUID. I don't have similar insight into the nonsensical keys and values, but they
could also be left- or right-shifted.
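
To make the shift concrete, here is a minimal sketch (mine, not part of the ticket) that takes the
two corrupt column names quoted above and checks which byte rotations of their 16-byte form recover
a version-1 (time-based) UUID of the shape pycassa generates:

    import uuid

    def rotate(u, n):
        """Rotate the 16-byte UUID representation right by n bytes (left if n < 0)."""
        b = u.bytes
        n %= 16
        return uuid.UUID(bytes=b[-n:] + b[:-n]) if n else u

    # The corrupt column names quoted above.
    corrupt = [
        uuid.UUID("22050aa4-de11-e380-8080-80808080800b"),
        uuid.UUID("10c326eb-86a4-e211-e380-808080808080"),
    ]

    for u in corrupt:
        for n in range(-3, 4):
            candidate = rotate(u, n)
            if candidate.version == 1:  # version 1 == time-based UUID
                print(f"{u} rotated by {n:+d} byte(s) -> {candidate}")

For these two examples, a single rotation each lands the version nibble (the 1 in ...-11e3-...) back
where a TimeUUID expects it, which is consistent with the bytes being shifted by a fixed amount
rather than scrambled.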

> 2.0 HSHA server introduces corrupt data
> ---------------------------------------
>
>                 Key: CASSANDRA-6285
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6285
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: 4 nodes, shortly updated from 1.2.11 to 2.0.2
>            Reporter: David Sauer
>            Assignee: Pavel Yaskevich
>            Priority: Critical
>             Fix For: 2.0.6
>
>         Attachments: 6285_testnotes1.txt, CASSANDRA-6285-disruptor-heap.patch, compaction_test.py
>
>
> After altering everything to LCS, the table OpsCenter.rollups60 and one other non-OpsCenter table
got stuck with everything hanging around in L0.
> The compaction started and ran until the logs showed this:
> ERROR [CompactionExecutor:111] 2013-11-01 19:14:53,865 CassandraDaemon.java (line 187)
Exception in thread Thread[CompactionExecutor:111,1,RMI Runtime]
> java.lang.RuntimeException: Last written key DecoratedKey(1326283851463420237, 37382e34362e3132382e3139382d6a7576616c69735f6e6f72785f696e6465785f323031335f31305f30382d63616368655f646f63756d656e74736c6f6f6b75702d676574426c6f6f6d46696c746572537061636555736564)
>= current key DecoratedKey(954210699457429663, 37382e34362e3132382e3139382d6a7576616c69735f6e6f72785f696e6465785f323031335f31305f30382d63616368655f646f63756d656e74736c6f6f6b75702d676574546f74616c4469736b5370616365557365640b0f)
writing into /var/lib/cassandra/data/OpsCenter/rollups60/OpsCenter-rollups60-tmp-jb-58656-Data.db
> 	at org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:141)
> 	at org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:164)
> 	at org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:160)
> 	at org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
> 	at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
> 	at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
> 	at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
> 	at org.apache.cassandra.db.compaction.CompactionManager$6.runMayThrow(CompactionManager.java:296)
> 	at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> 	at java.lang.Thread.run(Thread.java:724)
> Moving back to STCS kept the compactions running.
> It is especially my own table that I would like to move to LCS.
> After a major compaction with STCS, the move to LCS fails with the same exception.
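
The exception above is raised in SSTableWriter.beforeAppend, which rejects writing partitions out of
order: the last written key's token (1326283851463420237) is not less than the current key's token
(954210699457429663). A minimal sketch of that invariant, as an illustration rather than Cassandra's
actual code:

    # Illustration only: the gist of the ordering check behind the error above.
    last_token = None

    def before_append(token):
        """Refuse to append a partition whose token is not greater than the last one written."""
        global last_token
        if last_token is not None and last_token >= token:
            raise RuntimeError(
                f"Last written key token {last_token} >= current key token {token}")
        last_token = token

    before_append(954210699457429663)    # in order: fine
    before_append(1326283851463420237)   # still increasing: fine
    before_append(954210699457429663)    # out of order: raises, as in the compaction log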



--
This message was sent by Atlassian JIRA
(v6.2#6252)
