cassandra-commits mailing list archives

From "Nadav Har'El (JIRA)" <>
Subject [jira] [Created] (CASSANDRA-14262) View update sent multiple times during range movement
Date Mon, 26 Feb 2018 13:37:00 GMT
Nadav Har'El created CASSANDRA-14262:

             Summary: View update sent multiple times during range movement
                 Key: CASSANDRA-14262
             Project: Cassandra
          Issue Type: Improvement
          Components: Materialized Views
            Reporter: Nadav Har'El

This issue is about updating a base table with materialized views while token ranges are being
moved, i.e., while a node is being added to or removed from the cluster (this is a long process
because the data needs to be streamed to its new owning node).

During this process, each view mutation we want to write to a view table may have an additional
"pending node" (or several of them): another node (or nodes) which will hold this view mutation
once the movement completes, so we need to send the view mutation to these new nodes too. Code
to do this existed until CASSANDRA-13069, when it was accidentally removed, and it was restored
in CASSANDRA-14251.

However, the current code, in mutateMV(), has each of the RF (e.g., 3) base replicas send
the view mutation to the same pending node. This is of course redundant, and reduces write
throughput while the streaming is performed.
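
To make the redundancy concrete, here is a toy Python model of the behavior described above.
It is not Cassandra code: the node names and the helper sends_per_target() are hypothetical,
and the only assumption is that every base replica writes to its own paired view replica and
also to every pending node.

```python
def sends_per_target(base_replicas, paired_view, pending_nodes):
    """Count how many copies of one view mutation each target node receives.

    Toy model: each base replica sends the mutation to its own paired
    view replica and, in addition, to every pending node.
    """
    counts = {}
    for base in base_replicas:
        targets = [paired_view[base]] + list(pending_nodes)
        for target in targets:
            counts[target] = counts.get(target, 0) + 1
    return counts


# Hypothetical cluster with RF = 3 and one pending view node "p1".
base_replicas = ["b1", "b2", "b3"]
paired_view = {"b1": "v1", "b2": "v2", "b3": "v3"}
counts = sends_per_target(base_replicas, paired_view, ["p1"])
print(counts)
```

In this model each existing view replica receives the mutation once, while the pending node
receives it RF times.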

I suggested (based on an idea by [~shlomi_livne]) that it may be enough for only the single
base replica which will be paired with the pending node (once the range movement completes)
to send it the update. [~pauloricardomg] replied (see []) that such an optimization appears
to work in the common case of a single movement, but not in rarer, more complex cases (I did
not fully understand the details; check out the above link for them).
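
A rough sketch of the suggested optimization, for the single-movement case only, might look
like the following. Everything here is an assumption made for illustration: the helper
targets_for(), the node names, and in particular the paired_after map, which presumes we can
already compute which view replica each base replica will be paired with after the movement.
It does not address the more complex cases mentioned in the reply.

```python
def targets_for(base, paired_now, paired_after, pending_nodes):
    """Return the view nodes that `base` should send a view mutation to.

    Toy model: a base replica always writes to its current paired view
    replica, and writes to a pending node only if that pending node is
    the one it will be paired with once the range movement completes.
    """
    targets = [paired_now[base]]
    future = paired_after[base]
    if future in pending_nodes and future not in targets:
        targets.append(future)
    return targets


# Hypothetical movement: pending node "p1" will take over from "v3".
paired_now = {"b1": "v1", "b2": "v2", "b3": "v3"}
paired_after = {"b1": "v1", "b2": "v2", "b3": "p1"}
pending = {"p1"}

all_targets = {b: targets_for(b, paired_now, paired_after, pending) for b in paired_now}
sends_to_p1 = sum(t.count("p1") for t in all_targets.values())
print(all_targets, sends_to_p1)
```

Under these assumptions only b3 sends to the pending node, so it receives the mutation once
rather than RF times.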

I believe there's another problem with the current code, which is one of correctness: if any
view replica ends up with two different view rows for the same partition key, such a mistake
cannot currently be fixed (see CASSANDRA-10346). But if we have different base replicas with
two different values (an inconsistency that an ordinary base repair could fix, if we ran it)
and both of them send their update to the same pending view replica, this view replica will
now have two rows, one of them wrong (and it cannot currently be repaired).
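
The scenario can be illustrated with a small Python toy model. It is not Cassandra code and
ignores timestamps and tombstones; it assumes a view whose key is (c, pk), where pk is the
base primary key and c is a regular base column, and it reads "two rows" as two view rows
derived from the same base row. All names are hypothetical.

```python
def view_row_key(base_pk, c_value):
    """Key of the view row derived from a base row (toy model: (c, pk))."""
    return (c_value, base_pk)


# Two base replicas disagree on column c for base row pk = 1.
base_b1 = {1: "red"}   # one value
base_b2 = {1: "blue"}  # a different value for the same base row

# Each existing view replica is paired with exactly one base replica,
# so it only sees that replica's version of the row.
view_v1 = {view_row_key(pk, c) for pk, c in base_b1.items()}
view_v2 = {view_row_key(pk, c) for pk, c in base_b2.items()}

# The pending view replica receives the update from both base replicas.
view_pending = set()
for base in (base_b1, base_b2):
    for pk, c in base.items():
        view_pending.add(view_row_key(pk, c))

rows_for_pk1 = sorted(key for key in view_pending if key[1] == 1)
print(rows_for_pk1)
```

In this model the pending replica holds two view rows for base row 1, while each paired view
replica holds only one.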