cassandra-commits mailing list archives

From "Tyler Hobbs (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-12423) Cells missing from compact storage table after upgrading from 2.1.9 to 3.7
Date Tue, 23 Aug 2016 22:52:21 GMT


Tyler Hobbs commented on CASSANDRA-12423:

Regarding the patch, it took me a while to convince myself this was the correct behavior,
so I'll write out some comments for posterity.

In the 2.x world, a range tombstone with a final EOC of 0 results in an exclusive deletion
_except_ for cells that exactly match the tombstone bound.  I made a [dtest|]
to double check this (which we should commit, I guess).  This is why we can't just use {{EXCL_END_BOUND}}
in 3.0 -- in the case where the range tombstone bound exactly matches a {{ClusteringPrefix}},
we want that to be included in the deletion.
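
To make the three behaviors concrete, here is a small Python model (not Cassandra code). It assumes clusterings and bounds can be modeled as tuples of strings, ignores timestamps, and treats the tombstone as newer than every cell. {{EndKind}}, {{before_end}} and {{survivors}} are hypothetical names for illustration; the cells and the "a".."asd" tombstone are taken from the sstable2json output in the ticket.

```python
from enum import Enum


class EndKind(Enum):
    EXCL_END_BOUND = 1
    INCL_END_BOUND = 2
    END_BOUND_EOC_0 = 3  # name proposed in this comment


def before_end(bound, kind, clustering):
    """Return True if `clustering` falls inside a range that ends at `bound`."""
    head = clustering[:len(bound)]
    if head != bound:
        # The bound is neither equal to nor a prefix of the clustering,
        # so plain ordering decides.
        return head < bound
    if kind is EndKind.EXCL_END_BOUND:
        return False  # excludes the exact match and everything under the prefix
    if kind is EndKind.INCL_END_BOUND:
        return True   # includes the exact match and everything under the prefix
    # END_BOUND_EOC_0: only an exact match is included.
    return len(clustering) == len(bound)


def survivors(cells, start, end, kind):
    """Cells left after applying a tombstone with an inclusive start (simplified)."""
    return [c for c in cells if not (c >= start and before_end(end, kind, c))]


# Clusterings from the ticket: ("",), ("asd", ""), ("asd", "asd")
CELLS = [("",), ("asd", ""), ("asd", "asd")]

for kind in EndKind:
    print(kind.name, survivors(CELLS, ("a",), ("asd",), kind))
```

In this model the EOC-0 bound keeps all three rows, matching the 2.1.9 query result, while a plain inclusive end bound keeps only the first row, which is consistent with the single row reported for 3.7. The exclusive bound differs from EOC-0 only for a clustering that is exactly {{("asd",)}}.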

I think we should change the new enum name from {{INCL_END_BOUND_EOC_0}} to {{END_BOUND_EOC_0}}
for clarity.  It's sort of both inclusive and exclusive, so having either one in the name
is confusing, I think.  Regarding the code comment:

bq.  This value compares as an inclusive end bound but before any clustering values. It can
be removed once support for thrift and legacy sstables is removed.

I find the first half of that confusing.  Maybe rephrase it as "This value is inclusive of
any exact matches, but does not include any clusterings that it is merely a prefix of."  I
believe the second half of the comment is incorrect, as we'll need to continue treating those
range tombstones correctly as long as they exist.  Furthermore, once we've serialized tombstones
with that enum value, we'll always need it for deserialization.

My last concern is around compatibility in a mixed-version cluster.  It seems like there could
be problems if you did something like this:
* Have a 2.x cluster with range tombstones with an eoc=0 end bound
* Upgrade to some 3.x that doesn't include this fix
* Then, upgrade some nodes to a 3.y that _does_ include this fix while there are still legacy sstables

It seems like the 3.y nodes could send results with the new enum value to the 3.x nodes which
would then error while deserializing the results.  I'm not sure what to do about that yet.
 We could require all sstables to be upgraded before upgrading to 3.10 from 3.x, but that
would also force any remaining legacy tombstones like this to be permanently treated incorrectly,
which kind of sucks.
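
The failure mode I'm worried about can be sketched in a few lines of Python. This is only an illustration under assumptions: it supposes bound kinds travel between nodes as a small integer code, and the kind lists, {{encode}} and {{decode}} are hypothetical stand-ins, not the real serializer or the full set of kinds.

```python
# Hypothetical subset of bound kinds known to a node without the fix (3.x).
OLD_KINDS = ["EXCL_END_BOUND", "INCL_START_BOUND", "INCL_END_BOUND", "EXCL_START_BOUND"]

# A node with the fix (3.y) knows one extra kind. In this sketch it is appended
# at the end so that the existing codes keep their meaning.
NEW_KINDS = OLD_KINDS + ["END_BOUND_EOC_0"]


def encode(kind, known_kinds):
    """Serialize a kind as a single byte holding its position in `known_kinds`."""
    return bytes([known_kinds.index(kind)])


def decode(payload, known_kinds):
    """Deserialize a kind; fail if the sender used a code this node does not know."""
    code = payload[0]
    if code >= len(known_kinds):
        raise ValueError(f"unknown bound kind code {code}")
    return known_kinds[code]


# A 3.y node can still talk to a 3.x node about the kinds they share ...
print(decode(encode("INCL_END_BOUND", NEW_KINDS), OLD_KINDS))

# ... but the new kind cannot be decoded by the older node.
try:
    decode(encode("END_BOUND_EOC_0", NEW_KINDS), OLD_KINDS)
except ValueError as exc:
    print("3.x node failed:", exc)
```

Appending the new value keeps the shared codes compatible, but it does nothing for the older node when the new code itself shows up, which is the case described above.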

> Cells missing from compact storage table after upgrading from 2.1.9 to 3.7
> --------------------------------------------------------------------------
>                 Key: CASSANDRA-12423
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Tomasz Grabiec
>            Assignee: Stefania
>         Attachments: 12423.tar.gz
> Schema:
> {code}
> create table ks1.test ( id int, c1 text, c2 text, v int, primary key (id, c1, c2)) with compact storage and compression = {'sstable_compression': ''};
> {code}
> sstable2json before upgrading:
> {code}
> [
> {"key": "1",
>  "cells": [["","0",1470761440040513],
>            ["a","asd",2470761440040513,"t",1470764842],
>            ["asd:","0",1470761451368658],
>            ["asd:asd","0",1470761449416613]]}
> ]
> {code}
> Query result with 2.1.9:
> {code}
> cqlsh> select * from ks1.test;
>  id | c1  | c2   | v
> ----+-----+------+---
>   1 |     | null | 0
>   1 | asd |      | 0
>   1 | asd |  asd | 0
> (3 rows)
> {code}
> Query result with 3.7:
> {code}
> cqlsh> select * from ks1.test;
>  id | 6331 | 6332 | v
> ----+------+------+---
>   1 |      | null | 0
> (1 rows)
> {code}

This message was sent by Atlassian JIRA
