cassandra-commits mailing list archives

From "Benedict (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CASSANDRA-8683) Ensure early reopening has no overlap with replaced files
Date Mon, 09 Feb 2015 08:57:34 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-8683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benedict updated CASSANDRA-8683:
--------------------------------
    Description: 
When introducing CASSANDRA-6916 we permitted early-opened files to overlap with the files
they were replacing by one DecoratedKey, as this permitted a few minor simplifications. Unfortunately
this breaks assumptions in LeveledCompactionScanner, which causes the intermittent unit
test failures: http://cassci.datastax.com/job/trunk_utest/1330/testReport/junit/org.apache.cassandra.db.compaction/LeveledCompactionStrategyTest/testValidationMultipleSSTablePerLevel/

This patch by itself does not fix the bug, but it fixes the aspect described above by ensuring
the replaced and replacing files never overlap. This is achieved first by always selecting
the replaced file's start as the next key present in that file that is greater than the last key
in the new file(s). If there is no such key, the reader has no data left to return. However,
to permit abort and atomic replacement at the end of a macro compaction action, we must still
keep the file in the DataTracker for replacement purposes, while not returning it to consumers
(especially as many assume a non-empty range). For this I have introduced a new OpenReason called
SHADOWED, and a DataTracker.View.shadowed collection of sstables that tracks those we still
consider to be in the live set but from which we no longer answer any queries.
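
The start-selection rule can be illustrated with a minimal sketch. This is not the
patch itself: the class EarlyReopenSketch, the State enum and the use of a TreeSet of
integer keys are hypothetical stand-ins for the sstable index and DecoratedKey. Only
the name SHADOWED is taken from the description above.

```java
import java.util.Arrays;
import java.util.Optional;
import java.util.TreeSet;

// Hypothetical illustration only; these are not Cassandra's classes.
final class EarlyReopenSketch {
    // TRIMMED is an assumed name for "still readable from a later start";
    // SHADOWED mirrors the OpenReason named in the description.
    enum State { TRIMMED, SHADOWED }

    // First key in the replaced file strictly greater than the last key
    // written to the new file(s), or empty if no such key exists.
    static <K extends Comparable<K>> Optional<K> newStart(TreeSet<K> replacedKeys, K lastWritten) {
        return Optional.ofNullable(replacedKeys.higher(lastWritten));
    }

    // A replaced file with no remaining key is kept for replacement
    // purposes only and should not be offered to readers.
    static <K extends Comparable<K>> State stateFor(TreeSet<K> replacedKeys, K lastWritten) {
        return newStart(replacedKeys, lastWritten).isPresent() ? State.TRIMMED : State.SHADOWED;
    }

    public static void main(String[] args) {
        TreeSet<Integer> replaced = new TreeSet<>(Arrays.asList(10, 20, 30));
        System.out.println(newStart(replaced, 20)); // Optional[30]
        System.out.println(stateFor(replaced, 30)); // SHADOWED
    }
}
```

Because the new start is strictly greater than the last written key, the two files
share no key, unlike the one-key overlap permitted earlier.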

CASSANDRA-8744 (and then CASSANDRA-8750) then ensure that these bounds are honoured, so that
we never break the assumption that files in LCS never overlap.
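
As a rough illustration of the non-overlap assumption, the sketch below checks a set
of inclusive key ranges for overlap. The class LevelOverlapSketch and its Range type
are hypothetical and use integer keys; they are not how Cassandra represents sstable
bounds. With inclusive bounds, two files sharing a single boundary key already count
as overlapping.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration only; these are not Cassandra's classes.
final class LevelOverlapSketch {
    // Stand-in for an sstable's inclusive [first, last] key range.
    static final class Range {
        final int first, last;
        Range(int first, int last) { this.first = first; this.last = last; }
    }

    // True if any two ranges share at least one key.
    static boolean hasOverlap(List<Range> level) {
        List<Range> sorted = new ArrayList<>(level);
        sorted.sort(Comparator.comparingInt((Range r) -> r.first));
        // After sorting by start, any overlap shows up between neighbours.
        for (int i = 1; i < sorted.size(); i++)
            if (sorted.get(i).first <= sorted.get(i - 1).last)
                return true;
        return false;
    }

    public static void main(String[] args) {
        // Shares key 20: the one-key overlap described above.
        System.out.println(hasOverlap(Arrays.asList(new Range(1, 20), new Range(20, 40)))); // true
        // Start moved to the next key: no overlap.
        System.out.println(hasOverlap(Arrays.asList(new Range(1, 20), new Range(21, 40)))); // false
    }
}
```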

  was:
Incremental repair holds a set of the sstables it started the repair on (we need to know
which sstables were actually validated to be able to anticompact them). This includes any
tmplink files that existed when the compaction started (if we didn't include those, we would
miss data, since we move the start point of the existing non-tmplink files).

With CASSANDRA-6916 we swap out those instances with new ones (SSTR.cloneWithNewStart / SSTW.openEarly),
meaning that the underlying file can get deleted even though we hold a reference.

This causes the unit test error: http://cassci.datastax.com/job/trunk_utest/1330/testReport/junit/org.apache.cassandra.db.compaction/LeveledCompactionStrategyTest/testValidationMultipleSSTablePerLevel/

(Note that it only fails on trunk: in 2.1 we don't hold references to the repairing
files for non-incremental repairs, but the bug should exist in 2.1 as well.)


> Ensure early reopening has no overlap with replaced files
> ---------------------------------------------------------
>
>                 Key: CASSANDRA-8683
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8683
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Marcus Eriksson
>            Assignee: Benedict
>            Priority: Critical
>             Fix For: 2.1.3
>
>         Attachments: 0001-avoid-NPE-in-getPositionsForRanges.patch
>
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
