cassandra-commits mailing list archives

From "Stefan Podkowinski (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-12888) Incremental repairs broken for MVs and CDC
Date Sun, 05 Mar 2017 15:45:33 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-12888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15896342#comment-15896342
] 

Stefan Podkowinski commented on CASSANDRA-12888:
------------------------------------------------

The repairedAt value stored in each sstable's metadata indicates the time at which the
sstable was repaired and nothing more. The basic idea behind tracking such a timestamp was
that once an sstable has been repaired, the data it contains is consistent, in the sense that
no node is missing any of it (tombstones included), and therefore we never have to repair
this data again. This is what makes incremental repairs possible. As simple as the idea is,
things become a bit tricky when we want to merge data, either by compaction or, in the case
of this ticket, by applying mutations. Compaction has been implemented so that we now have
two pools of sstables that are compacted independently of each other: unrepaired and repaired
data. Within each pool, sstables can be compacted together just fine, and in the case of
repaired data, the lowest repairedAt timestamp among the compaction candidates is used for
the output. However, the actual timestamp value currently doesn't really matter, as we only
use it to track whether an sstable has been repaired or not. Future repairs may be executed
based on unrepaired data only (incremental) or on both unrepaired and repaired data (full).
Does this answer your question?
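
The pool handling described above can be sketched roughly as follows. This is a minimal
illustration, not Cassandra's actual code: the SSTable class, the function names and the use
of 0 as the "unrepaired" marker are assumptions made for the example.

```python
from dataclasses import dataclass

# Assumption for this sketch: a repairedAt of 0 marks an sstable as unrepaired.
UNREPAIRED = 0


@dataclass(frozen=True)
class SSTable:
    name: str
    repaired_at: int  # repair time in milliseconds, or UNREPAIRED


def split_pools(sstables):
    """Split sstables into the unrepaired and the repaired pool."""
    unrepaired = [s for s in sstables if s.repaired_at == UNREPAIRED]
    repaired = [s for s in sstables if s.repaired_at != UNREPAIRED]
    return unrepaired, repaired


def compact(candidates, output_name):
    """Compact sstables from a single pool; mixing pools is rejected."""
    states = {s.repaired_at == UNREPAIRED for s in candidates}
    if len(states) != 1:
        raise ValueError("candidates must all come from the same pool")
    # The output carries the lowest repairedAt of its inputs.
    return SSTable(output_name, min(s.repaired_at for s in candidates))


def repair_candidates(sstables, incremental):
    """Incremental repair looks at unrepaired data only, full repair at both pools."""
    unrepaired, repaired = split_pools(sstables)
    return unrepaired if incremental else unrepaired + repaired
```

For example, compacting two repaired sstables with repairedAt 1000 and 2000 yields an output
with repairedAt 1000, which still only tells us that the data has been repaired.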

> Incremental repairs broken for MVs and CDC
> ------------------------------------------
>
>                 Key: CASSANDRA-12888
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12888
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Streaming and Messaging
>            Reporter: Stefan Podkowinski
>            Assignee: Benjamin Roth
>            Priority: Critical
>             Fix For: 3.0.x, 3.11.x
>
>
> SSTables streamed during the repair process will first be written locally and afterwards
either simply added to the pool of existing sstables or, in case of existing MVs or active
CDC, replayed on mutation basis:
> As described in {{StreamReceiveTask.OnCompletionRunnable}}:
> {quote}
> We have a special path for views and for CDC.
> For views, since the view requires cleaning up any pre-existing state, we must put all
partitions through the same write path as normal mutations. This also ensures any 2is are
also updated.
> For CDC-enabled tables, we want to ensure that the mutations are run through the CommitLog
so they can be archived by the CDC process on discard.
> {quote}
> Using the regular write path turns out to be an issue for incremental repairs, as we
lose the {{repaired_at}} state in the process. Eventually the streamed rows will end up in
the unrepaired set, in contrast to the rows on the sender side, which are moved to the
repaired set. The next repair run will stream the same data back again, causing rows to
bounce back and forth between nodes on each repair.
> See the linked dtest for steps to reproduce. An example of reproducing this manually using
ccm can be found [here|https://gist.github.com/spodkowinski/2d8e0408516609c7ae701f2bf1e515e8]
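
The bouncing described in the issue can be illustrated with a toy model of two replicas. It
is a deliberately simplified sketch: the Node class, the uses_write_path flag and the
set-based comparison are hypothetical stand-ins for the real streaming and repair machinery.

```python
class Node:
    def __init__(self, name, uses_write_path):
        self.name = name
        # True models a table with MVs or CDC: streamed data is replayed as mutations.
        self.uses_write_path = uses_write_path
        self.unrepaired = set()
        self.repaired = set()

    def receive(self, rows):
        if self.uses_write_path:
            # Replayed mutations lose the repaired state and land in the unrepaired set.
            self.unrepaired |= rows
        else:
            # Streamed sstables are added directly and keep their repaired state.
            self.repaired |= rows


def incremental_repair(a, b):
    """Compare unrepaired data only, stream the differences, return rows streamed."""
    to_b = a.unrepaired - b.unrepaired
    to_a = b.unrepaired - a.unrepaired
    # Local data that took part in the repair moves to the repaired set.
    for node in (a, b):
        node.repaired |= node.unrepaired
        node.unrepaired = set()
    a.receive(to_a)
    b.receive(to_b)
    return len(to_a) + len(to_b)
```

In this model, two nodes that use the write path keep streaming the same row on every run,
while two nodes that add the streamed sstables directly stop after the first repair.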



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
