cassandra-commits mailing list archives

From "Benedict (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-11886) Streaming will miss sections for early opened sstables during compaction
Date Wed, 01 Jun 2016 10:48:03 GMT


Benedict commented on CASSANDRA-11886:

So, just to clarify for readers, as far as I can tell the problem is _not_ the early open
files themselves, but the shifted starts of the sstables we're replacing - which aren't a
problem if we intersect _at all_ with their shifted instances, but if we move the starts so
we no longer notice them, we don't swap in the canonical instances.

This patch looks like it will fix the issue to me, but if we're at all worried about the algorithmic
complexity, we _should_ be able to use the OverlapIterator (that was introduced in 3.0 I think)
to improve it (possibly with some minor modification).
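
To make the shifted-start case above concrete, here is a minimal sketch in Java. It is not Cassandra's code: ShiftedStartSketch, ViewEntry, canonicalCandidates and midCompactionView are hypothetical names, tokens are modelled as plain longs, and the selection loop only approximates the behaviour described in this ticket (ignore early opened instances, swap in the canonical instance for everything else that intersects the range).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model for illustration only; none of these types exist in Cassandra.
final class ShiftedStartSketch {

    /** One entry in the live view: a generation and the token span it currently exposes. */
    static final class ViewEntry {
        final int generation;
        final long first;          // start of the span, possibly shifted during compaction
        final long last;           // end of the span (inclusive)
        final boolean earlyOpened; // true for a partially written compaction result

        ViewEntry(int generation, long first, long last, boolean earlyOpened) {
            this.generation = generation;
            this.first = first;
            this.last = last;
            this.earlyOpened = earlyOpened;
        }

        /** True if this entry's span overlaps the inclusive range [lo, hi]. */
        boolean intersects(long lo, long hi) {
            return first <= hi && lo <= last;
        }
    }

    /** Generations whose canonical instances would be picked for the inclusive range [lo, hi]. */
    static List<Integer> canonicalCandidates(List<ViewEntry> view, long lo, long hi) {
        List<Integer> result = new ArrayList<>();
        for (ViewEntry entry : view) {
            if (!entry.intersects(lo, hi)) {
                continue; // a shifted start can hide the replaced sstable at this point
            }
            if (entry.earlyOpened) {
                continue; // early opened instances have no canonical counterpart and are ignored
            }
            result.add(entry.generation); // stand-in for swapping in the canonical instance
        }
        return result;
    }

    /**
     * Assumed mid-compaction state: generation 1 originally spanned [0, 100]; its start was
     * moved to 60 once generation 2 was opened early over [0, 59].
     */
    static List<ViewEntry> midCompactionView() {
        List<ViewEntry> view = new ArrayList<>();
        view.add(new ViewEntry(1, 60, 100, false));
        view.add(new ViewEntry(2, 0, 59, true));
        return view;
    }
}
```

In this model a request for [50, 70] still intersects the shifted instance of generation 1, so its canonical instance is selected and covers the whole range. A request for [10, 30] only intersects the early opened generation 2, which is dropped, so nothing is selected even though the canonical generation 1 holds data for that range.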

> Streaming will miss sections for early opened sstables during compaction
> ------------------------------------------------------------------------
>                 Key: CASSANDRA-11886
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Stefan Podkowinski
>            Assignee: Marcus Eriksson
>            Priority: Critical
>              Labels: correctness, repair, streaming
>         Attachments: 9700-test-2_1.patch
> Once validation compaction has finished, all mismatching sstable sections for a
> token range will be used for streaming, as returned by {{StreamSession.getSSTableSectionsForRanges}}.
> Currently 2.1 will try to restrict the sstable candidates by checking whether they can be found
> in {{CANONICAL_SSTABLES}} and will ignore them otherwise. At the same time, the {{IntervalTree}}
> in the {{DataTracker}} will be built from replaced, non-canonical sstables as well. In
> the case of early opened sstables this becomes a problem, as the tree will be updated with {{OpenReason.EARLY}}
> replacements that cannot be found in canonical. Whenever {{getSSTableSectionsForRanges}}
> gets an early instance from the view, it will fail to retrieve the corresponding canonical
> version from the map, as the different generation will cause a hashcode mismatch. Please find
> a test attached.
> As a consequence, not all sections for a range are streamed. In our case this has caused
> deleted data to reappear, as sections holding tombstones were left out due to this behavior.
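
The failed map lookup described in the report can be illustrated with a small Java sketch. Everything here is an assumption made for illustration: CanonicalLookupSketch and its nested Descriptor are hypothetical types, and the only property taken from the report is that the key's identity includes the generation, so an instance with a different generation never matches a canonical entry.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical model for illustration only; not Cassandra's actual classes.
final class CanonicalLookupSketch {

    /** Key whose equality and hash code both depend on the generation. */
    static final class Descriptor {
        final String table;
        final int generation;

        Descriptor(String table, int generation) {
            this.table = table;
            this.generation = generation;
        }

        @Override
        public boolean equals(Object other) {
            if (this == other) {
                return true;
            }
            if (!(other instanceof Descriptor)) {
                return false;
            }
            Descriptor that = (Descriptor) other;
            return generation == that.generation && table.equals(that.table);
        }

        @Override
        public int hashCode() {
            return Objects.hash(table, generation);
        }
    }

    /** Assumed canonical set: only generation 1 of table "users" is canonical. */
    static Map<Descriptor, String> canonicalMap() {
        Map<Descriptor, String> canonical = new HashMap<>();
        canonical.put(new Descriptor("users", 1), "users-1-canonical");
        return canonical;
    }

    /** Returns the canonical entry for the given descriptor, or null if there is none. */
    static String lookupCanonical(Descriptor descriptor) {
        return canonicalMap().get(descriptor);
    }
}
```

Looking up a descriptor for generation 1 finds the canonical entry, while a descriptor for an early opened generation 2 of the same table yields null. A caller that skips null results would therefore silently drop that sstable's sections.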

This message was sent by Atlassian JIRA
