cassandra-commits mailing list archives

From "Aaron Morton (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CASSANDRA-2829) memtable with no post-flush activity can leave commitlog permanently dirty
Date Thu, 21 Jul 2011 13:01:58 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-2829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aaron Morton updated CASSANDRA-2829:
------------------------------------

    Attachment: 0002-2829-v08.patch
                0001-2829-unit-test-v08.patch

I got to take another look at this tonight on the 0.8 trunk and ported the unit test to 0.8.


The 0002-2829-v08 patch was my second attempt. It changes CFS.forceFlush() to always flush
and trusts that maybeSwitchMemtable() will only flush CFs that are not clean.

There are no changes to CommitLog.discardCompletedSegmentsInternal(). The CF will be turned
off in any segment that is not the context segment. It will always be turned on in the current
/ context segment. I think this gives the correct behaviour, i.e. the CF can never have dirty
changes in a segment that is not current AND the CF may have changes in a segment that is
current. It is a bit sloppy though, as clean CFs will mark segments as dirty, which may delay
those segments being cleaned.
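
To make the on/off rule concrete, here is a minimal toy model in Java. ToyCommitLog and Segment are hypothetical names made up for illustration; they are not the classes touched by the patch, and the model only tracks which CF ids are marked dirty per segment. It assumes, as described above, that a flush turns the CF off in every non-context segment and on in the context segment.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model only; ToyCommitLog and Segment are hypothetical, not Cassandra classes. */
class ToyCommitLog {
    static final class Segment {
        final Set<Integer> dirtyCfIds = new HashSet<>();
        boolean isSafeToDelete() { return dirtyCfIds.isEmpty(); }
    }

    final List<Segment> segments = new ArrayList<>();

    ToyCommitLog() { roll(); }

    /** Starts a new current segment. */
    Segment roll() {
        Segment s = new Segment();
        segments.add(s);
        return s;
    }

    Segment current() { return segments.get(segments.size() - 1); }

    /** A write marks the CF dirty in the current segment. */
    void write(int cfId) { current().dirtyCfIds.add(cfId); }

    /** The rule described above: off in every non-context segment, on in the context segment. */
    void discardCompleted(int cfId, Segment context) {
        for (Segment s : segments) {
            if (s == context) s.dirtyCfIds.add(cfId);
            else s.dirtyCfIds.remove(cfId);
        }
    }

    public static void main(String[] args) {
        ToyCommitLog log = new ToyCommitLog();
        Segment first = log.current();
        log.write(1);                            // cf 1 is written in the first segment
        Segment second = log.roll();             // the log rolls; cf 1 is not written again
        log.discardCompleted(1, log.current());  // unconditional flush of the now-clean cf 1
        System.out.println("first deletable: " + first.isSafeToDelete());
        System.out.println("second deletable: " + second.isSafeToDelete());
    }
}
```

In this model the first segment becomes deletable, while the second is marked dirty by a CF that has no unflushed writes in it, which is the sloppiness mentioned above.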


I also think there is a theoretical risk of a race condition with access to the segments Deque.
The iterator runs in the postFlushExecutor, while the segments are added on the appropriate
commit log executor service.
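
If the race is real, one common way to rule it out is to confine every access to the deque to a single thread. The sketch below is a generic illustration of that idea, not what the patch does: ConfinedSegments, addSegment and countSegments are hypothetical names, and the segments are plain strings.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch: only the executor thread ever touches the deque. */
class ConfinedSegments {
    private final Deque<String> segments = new ArrayDeque<>();
    private final ExecutorService logExecutor = Executors.newSingleThreadExecutor();

    /** Adds a segment on the owning thread. */
    Future<?> addSegment(String name) {
        return logExecutor.submit(() -> segments.addLast(name));
    }

    /** Iterates on the owning thread, so it cannot interleave with an add. */
    int countSegments() {
        try {
            return logExecutor.submit(() -> {
                int n = 0;
                for (String ignored : segments) n++;
                return n;
            }).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    void shutdown() { logExecutor.shutdown(); }

    public static void main(String[] args) {
        ConfinedSegments log = new ConfinedSegments();
        log.addSegment("CommitLog-1.log");
        log.addSegment("CommitLog-2.log");
        System.out.println("segments seen: " + log.countSegments());
        log.shutdown();
    }
}
```

Because a single-threaded executor runs tasks in submission order, the iteration task sees every add submitted before it and never runs concurrently with one.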



> memtable with no post-flush activity can leave commitlog permanently dirty 
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-2829
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2829
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Aaron Morton
>            Assignee: Jonathan Ellis
>             Fix For: 0.8.2
>
>         Attachments: 0001-2829-unit-test-v08.patch, 0001-2829-unit-test.patch, 0002-2829-v08.patch,
0002-2829.patch
>
>
> Only dirty memtables are flushed, and so only dirty memtables are used to discard obsolete
commit log segments. This can result in log segments not being deleted even though the data
has been flushed.
> Was using a 3 node 0.7.6-2 AWS cluster (DataStax AMIs) with pre 0.7 data loaded and
a running application working against the cluster. Did a rolling restart and then kicked off
a repair. One node filled up the commit log volume with 7GB+ of log data; there were about
20 hours of log files.
> {noformat}
> $ sudo ls -lah commitlog/
> total 6.9G
> drwx------ 2 cassandra cassandra  12K 2011-06-24 20:38 .
> drwxr-xr-x 3 cassandra cassandra 4.0K 2011-06-25 01:47 ..
> -rw------- 1 cassandra cassandra 129M 2011-06-24 01:08 CommitLog-1308876643288.log
> -rw------- 1 cassandra cassandra   28 2011-06-24 20:47 CommitLog-1308876643288.log.header
> -rw-r--r-- 1 cassandra cassandra 129M 2011-06-24 01:36 CommitLog-1308877711517.log
> -rw-r--r-- 1 cassandra cassandra   28 2011-06-24 20:47 CommitLog-1308877711517.log.header
> -rw-r--r-- 1 cassandra cassandra 129M 2011-06-24 02:20 CommitLog-1308879395824.log
> -rw-r--r-- 1 cassandra cassandra   28 2011-06-24 20:47 CommitLog-1308879395824.log.header
> ...
> -rw-r--r-- 1 cassandra cassandra 129M 2011-06-24 20:38 CommitLog-1308946745380.log
> -rw-r--r-- 1 cassandra cassandra   36 2011-06-24 20:47 CommitLog-1308946745380.log.header
> -rw-r--r-- 1 cassandra cassandra 112M 2011-06-24 20:54 CommitLog-1308947888397.log
> -rw-r--r-- 1 cassandra cassandra   44 2011-06-24 20:47 CommitLog-1308947888397.log.header
> {noformat}
> The user KS has 2 CFs with 60 minute flush times. The System KS had the default setting,
which is 24 hours. Will create another ticket to see if these can be reduced or if it's something
users should do; in this case it would not have mattered.
> I grabbed the log headers and used the tool in CASSANDRA-2828, and most of the segments
had the system CFs marked as dirty.
> {noformat}
> $ bin/logtool dirty /tmp/logs/commitlog/
> Not connected to a server, Keyspace and Column Family names are not available.
> /tmp/logs/commitlog/CommitLog-1308876643288.log.header
> Keyspace Unknown:
> 	Cf id 0: 444
> /tmp/logs/commitlog/CommitLog-1308877711517.log.header
> Keyspace Unknown:
> 	Cf id 1: 68848763
> ...
> /tmp/logs/commitlog/CommitLog-1308944451460.log.header
> Keyspace Unknown:
> 	Cf id 1: 61074
> /tmp/logs/commitlog/CommitLog-1308945597471.log.header
> Keyspace Unknown:
> 	Cf id 1000: 43175492
> 	Cf id 1: 108483
> /tmp/logs/commitlog/CommitLog-1308946745380.log.header
> Keyspace Unknown:
> 	Cf id 1000: 239223
> 	Cf id 1: 172211
> /tmp/logs/commitlog/CommitLog-1308947888397.log.header
> Keyspace Unknown:
> 	Cf id 1001: 57595560
> 	Cf id 1: 816960
> 	Cf id 1000: 0
> {noformat}
> CF 0 is the Status / LocationInfo CF and 1 is the HintedHandoff CF. I don't have it now,
but IIRC CFStats showed the LocationInfo CF with dirty ops.
> I was able to repro a case where flushing the CFs did not mark the log segments as obsolete
(attached unit-test patch). Steps are:
> 1. Write to cf1 and flush.
> 2. The current log segment is marked as dirty at the CL position when the flush started
(in CommitLog.discardCompletedSegmentsInternal()).
> 3. Do not write to cf1 again.
> 4. Roll the log; my test does this manually.
> 5. Write to cf2 and flush.
> 6. Only cf2 is flushed because it is the only dirty CF. cfs.maybeSwitchMemtable() is
not called for cf1 and so log segment 1 is still marked as dirty from cf1.
> Step 5 is not essential, just matched what I thought was happening. I thought SystemTable.updateToken()
was called which does not flush, and this was the last thing that happened.  
> The expired memtable thread created by Table uses the same cfs.forceFlush(), which is
a no-op if the CF or its secondary indexes are clean.
>     
> I think the same problem would exist in 0.8. 
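
The six reproduction steps in the quoted description can be sketched as a small toy model in Java. This is not the attached unit test: DirtySegmentRepro and all of its methods are hypothetical, and the model assumes only what the description states, namely that flushing a clean CF is a no-op and that a real flush turns the CF off in older segments and on in the current one.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy reproduction; every name here is hypothetical, not Cassandra's API. */
class DirtySegmentRepro {
    // One set of dirty CF ids per log segment; the last entry is the current segment.
    final List<Set<Integer>> segments = new ArrayList<>();
    // CFs whose memtable holds unflushed writes.
    final Set<Integer> dirtyMemtables = new HashSet<>();

    DirtySegmentRepro() { rollLog(); }

    void rollLog() { segments.add(new HashSet<>()); }

    void write(int cfId) {
        segments.get(segments.size() - 1).add(cfId);
        dirtyMemtables.add(cfId);
    }

    /** Pre-patch behaviour as described: flushing a clean CF does nothing. */
    void forceFlush(int cfId) {
        if (!dirtyMemtables.remove(cfId)) return; // clean memtable: no-op
        int context = segments.size() - 1;
        for (int i = 0; i < segments.size(); i++) {
            if (i == context) segments.get(i).add(cfId);
            else segments.get(i).remove(cfId);
        }
    }

    boolean firstSegmentDeletable() { return segments.get(0).isEmpty(); }

    static boolean runSteps() {
        DirtySegmentRepro r = new DirtySegmentRepro();
        r.write(1);       // step 1: write to cf1 ...
        r.forceFlush(1);  // ... and flush; step 2: the first segment stays marked for cf1
        r.rollLog();      // steps 3-4: no more cf1 writes, roll the log
        r.write(2);       // step 5: write to cf2 ...
        r.forceFlush(2);  // ... and flush
        r.forceFlush(1);  // step 6: cf1 is clean, so this is a no-op
        return r.firstSegmentDeletable();
    }

    public static void main(String[] args) {
        System.out.println("first segment deletable: " + runSteps());
    }
}
```

In this model the first segment is never reported as deletable, because nothing ever clears the mark that cf1 left in it.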

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
