cassandra-commits mailing list archives

From "Paulo Motta (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-13069) Local batchlog for MV may not be correctly written on node movements
Date Thu, 24 Aug 2017 09:14:00 GMT


Paulo Motta commented on CASSANDRA-13069:

Finally getting back to this after a while, sorry for the delay! Going back to the question
of the last review round:

bq. As far as I can tell, the goal of using a local batchlog is to guarantee eventual consistency
of the base table and its views. That is, no matter what happens for a given update, either
both that update and all the related view updates get eventually applied, or none of it is.
So I don't understand why:
1. we don't include local view mutations in the batchlog in SP.mutateMV.
2. the base table mutation isn't included in said batchlog alongside its related view updates.

I ended up writing a [trunk patch|]
to include both local and base table mutations in the batchlog as suggested, but then looking
at the original code I figured that whatever failure happens during views update (but before
the local base or views are persisted) is safeguarded by the [base table commit log write|]
prior to the view update, so I don't think we actually need to include the local mutations
in the batchlog.

Given that remote view writes are [fire and forget|],
 the most probable cause of failure during local view writing would be a crash, and in that
case the commit log replay will re-apply the base mutations and views. The downsides I
can think of are:
a) We cannot ensure the commit log is actually persisted before the crash, unless batch commit
log is used.
b) The user may have durable_writes=false
c) I'm not sure if many other non-crash failures are possible in this case (fail local base
and view mutations), besides full/corrupted disk, in which case you are screwed anyway, but
if they happen you'll need to wait until the next restart/commitlog replay to have your base
and views consistent.

There's not much we can do about a), so it seems we'll just need to live with this and rely
on repair to fix potential inconsistencies? b) is actually a problem even with including local
mutations in the batchlog write, unless we force batchlog writes to be durable. c) is not
a big deal unless there are other legitimate non-crash scenarios which I'm not aware of.
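
To make the argument above concrete, here is a minimal toy model of the recovery path. It is not Cassandra code: the `Node` class, its methods and the `crash_before_apply` flag are hypothetical names made up for illustration, and the commit log is assumed to be synced to disk before the crash (which is exactly what downside a) says we cannot guarantee without batch commit log).

```python
class Node:
    """Toy replica: a commit log plus in-memory base and view tables."""

    def __init__(self, durable_writes=True):
        self.durable_writes = durable_writes
        self.commit_log = []  # assumed to survive a crash (i.e. already synced)
        self.base = {}        # base table: key -> value
        self.view = {}        # view keyed by the base value: value -> key

    def _apply(self, key, value):
        # Applying a base mutation also generates and applies its local view update.
        self.base[key] = value
        self.view[value] = key

    def write(self, key, value, crash_before_apply=False):
        # The commit log write happens before the base/view update.
        if self.durable_writes:
            self.commit_log.append((key, value))
        if crash_before_apply:
            return  # simulated crash: nothing reached the base or the view
        self._apply(key, value)

    def restart(self):
        # In-memory state is lost; commit log replay re-applies base and views.
        self.base, self.view = {}, {}
        for key, value in self.commit_log:
            self._apply(key, value)


node = Node()
node.write("k1", "v1", crash_before_apply=True)
node.restart()
print(node.base, node.view)
```

With `durable_writes=False` the same sequence leaves both tables empty after the restart, which is downside b): there is nothing to replay, so the local base and view updates are simply gone.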

Adding the local base mutation to the view batchlog requires the following changes:
a) Special case the batchlog write-path to [skip writing the base mutation|],
since that will be written by the calling thread after the view updates, and [ack it|]
after the base table write is done so the batchlog can be cleaned up.
b) Special case the batchlog [replay path|]
to avoid generating views when replaying base table mutations in the case of view batchlogs.
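
The two special cases can be sketched roughly as follows. This is only a toy model under my own assumptions, not the patch itself: `ViewBatchlog`, `write_path`, `replay_path` and the `pending` counter are hypothetical names, and "applying" a mutation is modelled as appending a tuple to a list.

```python
class ViewBatchlog:
    """Toy batchlog: an entry is removed once every mutation in it is acked."""

    def __init__(self):
        self.entries = {}

    def store(self, batch_id, base_mutation, view_mutations):
        self.entries[batch_id] = {
            "base": base_mutation,
            "views": list(view_mutations),
            "pending": 1 + len(view_mutations),  # base + each view mutation
        }

    def ack(self, batch_id):
        entry = self.entries[batch_id]
        entry["pending"] -= 1
        if entry["pending"] == 0:
            del self.entries[batch_id]  # fully acked, batchlog entry cleaned up


def write_path(batchlog, batch_id, base_mutation, view_mutations, applied):
    batchlog.store(batch_id, base_mutation, view_mutations)
    # a) The batchlog write path only sends the view mutations ...
    for vm in view_mutations:
        applied.append(("view", vm))
        batchlog.ack(batch_id)
    # ... the calling thread writes the base mutation afterwards and acks it.
    applied.append(("base", base_mutation))
    batchlog.ack(batch_id)


def replay_path(batchlog, batch_id, applied):
    entry = batchlog.entries.pop(batch_id)
    # b) On replay the base mutation is applied WITHOUT generating views,
    # because the view mutations are already part of the same batch.
    applied.append(("base_without_view_generation", entry["base"]))
    for vm in entry["views"]:
        applied.append(("view", vm))
```

For example, after `write_path(bl, "b1", "base-m", ["v1", "v2"], applied)` the toy batchlog is empty again, whereas an entry that was stored but never fully acked is what `replay_path` would pick up.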

So, unless there's a good reason not mentioned above to include the local mutations on the
view batchlog, I'd prefer to keep the current approach of writing only remote view mutations
in the view batchlog. If you agree with that, I [added a comment|]
in the original patch explaining why the local mutations are not included in the local batchlog
to avoid confusion in the future. Please find the updated patch below:


I've added 2 [dtests|]
to check that base and view are consistent on crash before and after the view is applied,
which detected that the batchlog recovered from the commit log is not replayed straight away
due to the batchlog timeout, so I removed that wait on the [first replay|].
I also updated the MV test to verify that the view batchlog is [empty after replay|].

I did a few simplifications in the MV and batchlog write path in the [other patch|]
which may be worth keeping, so if we decide to keep the current approach of skipping local
mutations on the batchlog I will open a new ticket with the refactoring suggestions.

> Local batchlog for MV may not be correctly written on node movements
> --------------------------------------------------------------------
>                 Key: CASSANDRA-13069
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Materialized Views
>            Reporter: Sylvain Lebresne
>            Assignee: Paulo Motta
> Unless I'm really reading this wrong, I think the code [here|],
> which comes from CASSANDRA-10674, isn't working properly.
> More precisely, I believe we can have both paired and unpaired mutations, so that both
> {{if}} branches can be taken, but if that's the case, the 2nd write to the batchlog will basically
> overwrite (remove) the batchlog write of the 1st {{if}} and I don't think that's the intention.
> In practice, this means "paired" mutations won't be in the batchlog, which means they won't
> be replayed at all if they fail.
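
The overwrite described in the issue can be reproduced with a small toy model. The function names and the dict-based batchlog below are hypothetical and only illustrate the shape of the problem as described (two conditional writes under the same batch id); the "merged" variant is one possible way to avoid the overwrite, not necessarily what the actual patch does.

```python
def write_batchlog_buggy(batchlog, batch_id, paired, unpaired):
    # Both branches can be taken when a batch has paired AND unpaired mutations.
    if paired:
        batchlog[batch_id] = list(paired)
    if unpaired:
        # Same batch id: this write replaces the entry written just above.
        batchlog[batch_id] = list(unpaired)


def write_batchlog_merged(batchlog, batch_id, paired, unpaired):
    # Single write containing every mutation that may need to be replayed.
    mutations = list(paired) + list(unpaired)
    if mutations:
        batchlog[batch_id] = mutations


buggy, merged = {}, {}
write_batchlog_buggy(buggy, "batch-1", ["paired-1"], ["unpaired-1"])
write_batchlog_merged(merged, "batch-1", ["paired-1"], ["unpaired-1"])
print(buggy)   # the paired mutation is missing from the entry
print(merged)  # both mutations are present
```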

This message was sent by Atlassian JIRA
