cassandra-commits mailing list archives

From "Jeremiah Jordan (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-9669) If sstable flushes complete out of order, on restart we can fail to replay necessary commit log records
Date Fri, 13 May 2016 16:10:13 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-9669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15282844#comment-15282844 ]

Jeremiah Jordan edited comment on CASSANDRA-9669 at 5/13/16 4:10 PM:
---------------------------------------------------------------------

This seems to have broken something.  A bunch of our tests have started failing with timeouts
when truncating tables, and a bunch of threads are blocked like so:

{code}
"SharedPool-Worker-15" 
   java.lang.Thread.State: BLOCKED
        at org.apache.cassandra.db.ColumnFamilyStore.truncateBlocking(ColumnFamilyStore.java:1973)
        at org.apache.cassandra.db.TruncateVerbHandler.doVerb(TruncateVerbHandler.java:40)
        at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:67)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164)
        at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136)
        at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105)
        at java.lang.Thread.run(Thread.java:745)
{code}

On cassandra-3.0@78a3d2bba95b9efcda152a157f822f4970f22636


> If sstable flushes complete out of order, on restart we can fail to replay necessary commit log records
> -------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-9669
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9669
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local Write-Read Paths
>            Reporter: Benedict
>            Assignee: Benedict
>            Priority: Critical
>              Labels: correctness
>             Fix For: 2.2.7, 3.7, 3.0.7
>
>
> While {{postFlushExecutor}} ensures it never expires CL entries out-of-order, on restart
> we simply take the maximum replay position of any sstable on disk, and ignore anything prior.
>
> It is quite possible for there to be two flushes triggered for a given table, and for
> the second to finish first by virtue of containing a much smaller quantity of live data (or
> perhaps the disk is just under less pressure). If we crash before the first sstable has been
> written, then on restart the data it would have represented will disappear, since we will
> not replay the CL records.
> This looks to be a bug present since time immemorial, and also seems pretty serious.
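
The failure mode described in the report can be sketched with a small model. This is only an illustration under stated assumptions: the class {{ReplayPositionSketch}}, the {{Flush}} type and both method names are hypothetical and do not exist in Cassandra, commit log positions are simplified to plain longs, and the "contiguous" variant is just one possible safer rule, not necessarily the fix applied for this ticket.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative model only; none of these names exist in Cassandra.
final class ReplayPositionSketch {

    /** A flush covering commit log positions [start, end); completed means its sstable reached disk. */
    static final class Flush {
        final long start;
        final long end;
        final boolean completed;

        Flush(long start, long end, boolean completed) {
            this.start = start;
            this.end = end;
            this.completed = completed;
        }
    }

    /** Behaviour described in the report: replay starts at the maximum position of any sstable on disk. */
    static long naiveReplayStart(List<Flush> flushes) {
        long max = 0;
        for (Flush f : flushes) {
            if (f.completed) {
                max = Math.max(max, f.end);
            }
        }
        return max;
    }

    /** Assumed alternative: replay starts where the unbroken run of completed flushes ends. */
    static long contiguousReplayStart(List<Flush> flushes) {
        List<Flush> sorted = new ArrayList<>(flushes);
        sorted.sort((a, b) -> Long.compare(a.start, b.start));
        long pos = 0;
        for (Flush f : sorted) {
            // Stop at the first flush that never completed or that leaves a gap.
            if (!f.completed || f.start > pos) {
                break;
            }
            pos = Math.max(pos, f.end);
        }
        return pos;
    }

    public static void main(String[] args) {
        // The first (large) flush is still in progress at crash time; the second (small) one finished.
        List<Flush> flushes = Arrays.asList(
            new Flush(0, 100, false),
            new Flush(100, 120, true));

        System.out.println(naiveReplayStart(flushes));      // prints 120: records 0..100 are skipped
        System.out.println(contiguousReplayStart(flushes)); // prints 0: records 0..100 are replayed
    }
}
```

In this model the naive rule skips positions 0 to 100 even though no sstable holding them was ever written, which is the data loss the report describes.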



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
