cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Aleksey Yeschenko (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CASSANDRA-13691) Fix incorrect [2.1 <— 3.0] serialization of counter cells with pre-2.1 local shards
Date Wed, 26 Jul 2017 16:44:00 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-13691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Aleksey Yeschenko updated CASSANDRA-13691:
------------------------------------------
    Reviewer: Sam Tunnicliffe

> Fix incorrect [2.1 <— 3.0] serialization of counter cells with pre-2.1 local shards
> -----------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-13691
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13691
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Coordination
>            Reporter: Aleksey Yeschenko
>            Assignee: Aleksey Yeschenko
>              Labels: counters, upgrade
>             Fix For: 3.0.x, 3.11.x
>
>
> We stopped generating local shards in C* 2.1, after CASSANDRA-6504 (Counters 2.0). But
it’s still possible to have counter cell values
> around, remaining from 2.0 times, on 2.1, 3.0, 3.11, and even trunk nodes, if they’ve
never been overwritten.
> In 2.1, we used two classes for two kinds of counter columns:
> {{CounterCell}} class to store counters - internally as collections of {{CounterContext}}
blobs, encoding collections of (host id, count, clock) tuples
> {{CounterUpdateCell}} class to represent unapplied increments - essentially a single
long value; this class was never written to commit log, memtables, or sstables, and was only
used inside {{Mutation}} object graph - in memory, and marshalled over network in cases when
counter write coordinator and counter write leader were different nodes
> 3.0 got rid of {{CounterCell}} and {{CounterUpdateCell}}, among other {{Cell}} classes.
In order to represent these unapplied increments - equivalents of 2.1 {{CounterUpdateCell}}
- in 3.0 we encode them as regular counter columns, with a ‘special’ {{CounterContext}}
value. I.e. a counter context with a single local shard. We do that so that we can reuse local
shard reconcile logic (summing up) to seamlessly support counters with same names collapsing
to single increments in batches. See {{UpdateParameters.addCounter()}} method comments [here|https://github.com/apache/cassandra/blob/cassandra-3.0.14/src/java/org/apache/cassandra/cql3/UpdateParameters.java#L157-L171]
for details. It also assumes that nothing else can generate a counter with local shards.
> It works fine in pure 3.0 clusters, and in mixed 2.1/3.0 clusters, assuming that there
are no counters with legacy local shards remaining from 2.0 era. It breaks down badly if there
are.
> {{LegacyLayout.serializeAsLegacyPartition()}} and consequently {{LegacyCell.isCounterUpdate()}}
- classes responsible for serializing and deserialising in 2.1 format for compatibility -
use the following logic to tell if a cell of {{COUNTER}} kind is a regular final counter or
an unapplied increment:
> {code}
> private boolean isCounterUpdate()
> {
>     // See UpdateParameters.addCounter() for more details on this
>     return isCounter() && CounterContext.instance().isLocal(value);
> }
> {code}
> {{CounterContext.isLocal()}} method here looks at the first shard of the collection of
tuples and returns true if it’s a local one.
> This method would correctly identify a cell generated by {{UpdateParameters.addCounter()}}
as a counter update and serialize it correctly as a 2.1 {{CounterUpdateCell}}. However, it
would also incorrectly flag any regular counter cell that just so happens to have a local
shard as the first tuple of the counter context as a counter update. If a 2.1 node as a coordinator
of a read requests fetches such a value from a 3.0 node, during a rolling upgrade, instead
of the expected {{CounterCell}} object it will receive a {{CounterUpdateCell}}, breaking all
the things. In the best case scenario it will cause an assert in {{AbstractCell.reconcileCounter()}}
to be raised.
> To fix the problem we must find an unambiguous way, without false positives or false
negatives, to represent and identify unapplied counter updates on 3.0 side. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@cassandra.apache.org
For additional commands, e-mail: commits-help@cassandra.apache.org


Mime
View raw message