cassandra-commits mailing list archives

From "Benedict (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-7437) Ensure writes have completed after dropping a table, before recycling commit log segments (CASSANDRA-7437)
Date Thu, 21 Aug 2014 06:40:30 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-7437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14105081#comment-14105081
] 

Benedict edited comment on CASSANDRA-7437 at 8/21/14 6:38 AM:
--------------------------------------------------------------

In hindsight, this was pretty obvious.

We wait for modifications to the commit log segment to complete before force recycling, but
we don't ensure those modifications have hit the memtables before flushing them to mark the
segment clean.

Patch attached that gets the keyspaces with records in the segment and waits for any current
writes to complete before flushing; it also includes a new long test to check this works as
advertised.

It may be worth mentioning that we did in fact already wait for these modifications to the
dropped table, so the error would not have caused commit logs to keep accumulating; the problem
is with sstables from _other keyspaces_ in the commit log (in this case the system keyspace),
which would be cleared on the next flush.
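
To make the race concrete, here is a minimal toy model in Python. It is not Cassandra code:
WriteOrder, Node, force_recycle and demo are hypothetical names invented for this sketch, and
the wait is simplified to "no write in flight at all" rather than "every write that started
before the barrier". It only shows why the flush must wait for writes that have reached the
commit log segment but not yet the memtable.

```python
import threading


class WriteOrder:
    """Counts in-flight writes so a flusher can wait for them (toy, hypothetical)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._in_flight = 0

    def start(self):
        with self._cond:
            self._in_flight += 1

    def finish(self):
        with self._cond:
            self._in_flight -= 1
            self._cond.notify_all()

    def await_writes(self):
        # Simplification: waits until no write is in flight at all, rather
        # than only for the writes that started before this call.
        with self._cond:
            while self._in_flight:
                self._cond.wait()


class Node:
    def __init__(self):
        self.order = WriteOrder()
        self.lock = threading.Lock()
        self.segment = []   # records appended to the commit log segment
        self.memtable = []  # records applied to the memtable
        self.flushed = []   # records written out by a flush

    def write(self, record, appended=None, resume=None):
        self.order.start()
        try:
            with self.lock:
                self.segment.append(record)
            if appended is not None:
                appended.set()   # record is in the log, not yet in the memtable
            if resume is not None:
                resume.wait()    # hold the write inside the race window
            with self.lock:
                self.memtable.append(record)
        finally:
            self.order.finish()

    def force_recycle(self, wait_for_writes):
        if wait_for_writes:
            self.order.await_writes()
        with self.lock:
            self.flushed.extend(self.memtable)
            self.memtable.clear()
            # Records only in the log would be lost if the segment were recycled now.
            lost = [r for r in self.segment if r not in self.flushed]
            self.segment.clear()
        return lost


def demo(wait_for_writes):
    node = Node()
    appended, resume = threading.Event(), threading.Event()
    writer = threading.Thread(target=node.write, args=("r1", appended, resume))
    writer.start()
    appended.wait()  # the write is now between log append and memtable apply
    if wait_for_writes:
        threading.Timer(0.05, resume.set).start()  # let the write finish while we wait
        lost = node.force_recycle(True)
    else:
        lost = node.force_recycle(False)
        resume.set()
    writer.join()
    return lost
```

Calling demo(False) recycles the segment while the record exists only in the log, so the
record is reported as lost; demo(True) blocks until the write has reached the memtable and
reports nothing lost.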



>  Ensure writes have completed after dropping a table, before recycling commit log segments
> (CASSANDRA-7437)
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-7437
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7437
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Benedict
>            Assignee: Benedict
>            Priority: Minor
>             Fix For: 2.1.0
>
>         Attachments: 7437.log, 7437.round2.txt, 7437_test.py
>
>
> I've noticed on unit test output that there are still assertions being raised here, so
> I've taken a torch to the code path to make damned certain it cannot happen in future
> # We now wait for all running reads on a column family or writes on the keyspace during
> a dropCf call
> # We wait for all appends to the prior commit log segments before recycling them
> # We pass the list of dropped Cfs into the CL.forceRecycle call so that they can be markedClean
> definitely after they have been marked finished
> # Finally, to prevent any possibility of this still happening causing any negative consequences,
> I've suppressed the assertion in favour of an error log message, as the assertion would break
> correct program flow for the drop and potentially result in undefined behaviour
> -(in actuality there is the slightest possibility still of a race condition on read of
> a secondary index that causes a repair driven write, but this is a really tiny race window,
> as I force wait for all reads after unlinking the CF, so it would have to be a read that grabbed
> the CFS reference before it was dropped, but hadn't quite started its read op yet).- In fact
> this is also safe, as these modifications all grab a write op from the Keyspace, which has
> to happen before they get the CFS, and also because we drop the data before waiting for reads
> to finish on the CFS.
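
As a rough illustration of the last two points in the description above (marking dropped
tables clean, then logging instead of asserting), here is a small Python sketch. It is not the
patch: mark_clean_and_check and the table names are hypothetical, and what the real code does
after logging the error is not stated in this message, so the sketch just returns whatever is
still dirty.

```python
import logging

logger = logging.getLogger("commitlog_sketch")


def mark_clean_and_check(dirty_tables, dropped_tables):
    """Return the tables that still have unflushed records in the segment.

    dirty_tables:   ids of tables with records in the segment (assumed input)
    dropped_tables: ids of tables dropped before the recycle (assumed input)
    """
    # A dropped table can never be flushed again, so treat it as clean.
    remaining = set(dirty_tables) - set(dropped_tables)
    if remaining:
        # Log rather than assert, so the drop itself can carry on.
        logger.error("Segment still dirty for %s after drop", sorted(remaining))
    return remaining
```

With only the dropped table dirty, nothing remains; a record from another keyspace (for
example the system keyspace) is what would be left over and logged.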



--
This message was sent by Atlassian JIRA
(v6.2#6252)
