cassandra-commits mailing list archives

From "Benjamin Roth (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-12888) Incremental repairs broken for MVs and CDC
Date Fri, 03 Mar 2017 16:13:45 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-12888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15894639#comment-15894639
] 

Benjamin Roth commented on CASSANDRA-12888:
-------------------------------------------

A repair must go through the write path except for some special cases. At first I also had the
idea of avoiding it completely, but in discussion with [~pauloricardomg] it turned out that this
may introduce inconsistencies that could only be fixed by a view rebuild, because it
leaves stale rows.
I know that all of this is totally counter-intuitive, but just "blindly" streaming all sstables
(incl. MV tables) down is not correct. This is why I am trying to improve the mutation-based
approach.

If the SSTables for MVs get corrupted or lost, the only way to fix it is to rebuild that view.
There is no way (at least none I see atm) to consistently repair a view from
other nodes.

The underlying principle is:
- A view must always be consistent with its base table
- A view does not have to be consistent among nodes; that's handled by repairing the base table
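
To make the principle above concrete, here is a deliberately simplified toy model in Python. It is not Cassandra code: the names `apply_via_write_path`, `stream_blindly` and `view_matches_base` are hypothetical, and real views rely on timestamps and tombstones that are left out here. It only sketches why a view has to be derived from the local base-table state rather than merged in from another node.

```python
# Toy model (assumption: NOT Cassandra internals). A base table maps
# key -> value, and a view indexes the same rows by (value, key).


def apply_via_write_path(base, view, key, new_value):
    """Apply a base-table mutation and maintain the local view with it."""
    old_value = base.get(key)
    if old_value is not None and old_value != new_value:
        # Clean up the pre-existing view row. Blind streaming skips this step.
        view.pop((old_value, key), None)
    base[key] = new_value
    view[(new_value, key)] = True


def stream_blindly(base, view, streamed_base, streamed_view):
    """Merge streamed base and view data without reading local state."""
    base.update(streamed_base)
    view.update(streamed_view)


def view_matches_base(base, view):
    """The view is consistent if it holds exactly one row per base row."""
    return set(view) == {(value, key) for key, value in base.items()}


# Write path: the old view row is removed when the base row changes.
base, view = {}, {}
apply_via_write_path(base, view, "k1", "old")
apply_via_write_path(base, view, "k1", "new")
print(view_matches_base(base, view))

# Blind streaming: the local view keeps the stale ("old", "k1") row.
base2, view2 = {"k1": "old"}, {("old", "k1"): True}
stream_blindly(base2, view2, {"k1": "new"}, {("new", "k1"): True})
print(view_matches_base(base2, view2))
```

In this sketch the streamed variant ends up with a view row that no longer has a matching base row, which is the kind of stale row that only a view rebuild would clean up.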

That's also why you don't have to run a repair before building a view. It would
not help anyway, because you NEVER have a 100% guaranteed consistent state. A repair only guarantees
consistency up to the point of repair.

The "know what you are doing" option is offered by CASSANDRA-13066 btw. 
In this ticket I also adapted the selection of CFs (tables + MVs) when doing a keyspace repair,
depending on whether the MV is repaired by stream or by mutation.

> Incremental repairs broken for MVs and CDC
> ------------------------------------------
>
>                 Key: CASSANDRA-12888
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12888
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Streaming and Messaging
>            Reporter: Stefan Podkowinski
>            Assignee: Benjamin Roth
>            Priority: Critical
>             Fix For: 3.0.x, 3.11.x
>
>
> SSTables streamed during the repair process will first be written locally and afterwards
either simply added to the pool of existing sstables or, in case of existing MVs or active
CDC, replayed on mutation basis:
> As described in {{StreamReceiveTask.OnCompletionRunnable}}:
> {quote}
> We have a special path for views and for CDC.
> For views, since the view requires cleaning up any pre-existing state, we must put all
partitions through the same write path as normal mutations. This also ensures any 2is are
also updated.
> For CDC-enabled tables, we want to ensure that the mutations are run through the CommitLog
so they can be archived by the CDC process on discard.
> {quote}
> Using the regular write path turns out to be an issue for incremental repairs, as we
lose the {{repaired_at}} state in the process. Eventually the streamed rows will end up in
the unrepaired set, in contrast to the rows on the sender side, which are moved to the repaired set.
The next repair run will stream the same data back again, causing rows to bounce on and on
between nodes on each repair.
> See the linked dtest for steps to reproduce. An example for reproducing this manually using
ccm can be found [here|https://gist.github.com/spodkowinski/2d8e0408516609c7ae701f2bf1e515e8]
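
The bouncing behaviour described in the issue can be sketched with a small simulation. This is a toy model under stated assumptions, not how Cassandra implements repair: real repair compares Merkle trees over token ranges rather than sets of rows, and `Node`, `incremental_repair` and `keep_repaired_at` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy replica: rows are split into a repaired and an unrepaired set."""
    repaired: set = field(default_factory=set)
    unrepaired: set = field(default_factory=set)


def incremental_repair(a, b, keep_repaired_at):
    """Run one toy repair round and return the number of rows streamed."""
    # Incremental repair only compares data that is still unrepaired.
    a_start, b_start = set(a.unrepaired), set(b.unrepaired)
    to_b = a_start - b_start
    to_a = b_start - a_start

    # Data that took part in the repair moves to the repaired set.
    a.repaired |= a_start
    a.unrepaired -= a_start
    b.repaired |= b_start
    b.unrepaired -= b_start

    for node, rows in ((a, to_a), (b, to_b)):
        if keep_repaired_at:
            # Streamed sstables are added directly and stay marked as repaired.
            node.repaired |= rows
        else:
            # Replay through the write path: the repaired_at state is lost.
            node.unrepaired |= rows
    return len(to_a) + len(to_b)


# With the write path, the same row is streamed in every round.
a, b = Node(unrepaired={"row1"}), Node()
print([incremental_repair(a, b, keep_repaired_at=False) for _ in range(4)])

# If repaired_at were preserved, the second round would stream nothing.
a, b = Node(unrepaired={"row1"}), Node()
print([incremental_repair(a, b, keep_repaired_at=True) for _ in range(3)])
```

In the first run the row alternates between the two nodes' unrepaired sets, so every round finds a difference; in the second run both unrepaired sets are empty after the first round.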



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
