cassandra-commits mailing list archives

From "Benjamin Lerer (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-12620) Resurrected empty rows on update to 3.x
Date Thu, 15 Dec 2016 10:47:58 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-12620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15751054#comment-15751054 ]

Benjamin Lerer commented on CASSANDRA-12620:
--------------------------------------------

Sorry for the delay. It took me some time and a certain amount of back and forth between
the {{2.1}} and {{3.0}} code bases to fully understand the problem.
[~bric3] Thanks for the SSTables, they helped a lot.

In {{2.1}}, when a row is created with a {{TTL}}, the row marker cell is created as an {{ExpiringCell}}.
When the row is later compacted, if the {{TTL}} has expired, that cell is converted into a
{{DeletedCell}}.

In {{3.0}}, the code used to read the pre-3.0 format was only expecting an {{ExpiringCell}}
for rows with a {{TTL}} and was ignoring the local deletion time of the {{DeletedCell}}. Because
of that, the row was no longer marked as deleted.
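
To make the two paragraphs above concrete, here is a minimal sketch of the behaviour. It is a simplified model, not the actual Cassandra code: {{RowMarkerSketch}}, {{LegacyCell}}, {{compact}}, {{isLiveIgnoringTombstone}} and {{isLiveHonouringTombstone}} are hypothetical names, and carrying the local deletion time over unchanged during compaction is an assumption made for brevity.

```java
// Simplified, hypothetical model of a legacy row marker cell.
// None of these types are the real Cassandra classes.
final class RowMarkerSketch {

    enum Kind { EXPIRING, DELETED }

    static final class LegacyCell {
        final Kind kind;
        final long timestampMicros;
        final int ttlSeconds;        // 0 for a DELETED cell
        final int localDeletionTime; // seconds since epoch

        LegacyCell(Kind kind, long timestampMicros, int ttlSeconds, int localDeletionTime) {
            this.kind = kind;
            this.timestampMicros = timestampMicros;
            this.ttlSeconds = ttlSeconds;
            this.localDeletionTime = localDeletionTime;
        }
    }

    // Models the 2.1 compaction step: an expired EXPIRING marker becomes a DELETED one.
    // Assumption: the local deletion time is carried over unchanged.
    static LegacyCell compact(LegacyCell cell, int nowInSeconds) {
        if (cell.kind == Kind.EXPIRING && cell.localDeletionTime <= nowInSeconds) {
            return new LegacyCell(Kind.DELETED, cell.timestampMicros, 0, cell.localDeletionTime);
        }
        return cell;
    }

    // Models the faulty read path: only an EXPIRING marker can make the row expire,
    // so the local deletion time of a DELETED marker is ignored.
    static boolean isLiveIgnoringTombstone(LegacyCell marker, int nowInSeconds) {
        if (marker.kind == Kind.EXPIRING) {
            return marker.localDeletionTime > nowInSeconds;
        }
        return true; // DELETED marker wrongly treated as a live marker without TTL
    }

    // Models the intended read path: a DELETED marker means the row is gone.
    static boolean isLiveHonouringTombstone(LegacyCell marker, int nowInSeconds) {
        if (marker.kind == Kind.DELETED) {
            return false;
        }
        return marker.localDeletionTime > nowInSeconds;
    }

    public static void main(String[] args) {
        LegacyCell marker = new LegacyCell(Kind.EXPIRING, 1L, 100, 1_100);
        LegacyCell compacted = compact(marker, 2_000); // TTL already expired at compaction time
        System.out.println(compacted.kind);
        System.out.println(isLiveIgnoringTombstone(compacted, 2_000));
        System.out.println(isLiveHonouringTombstone(compacted, 2_000));
    }
}
```

In this model the row only comes back to life if compaction ran after the {{TTL}} expired: an uncompacted marker is still {{EXPIRING}} and both read paths agree on it.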

As the problem was in the code used to deserialize the pre-3.0 format, it was not necessary
to run {{upgradesstables}} to see the problem.

I originally could not reproduce the problem because I did not think of running a compaction
before upgrading (SizeTieredCompaction).

I pushed a patch [here|https://github.com/apache/cassandra/compare/trunk...blerer:12620-3.0?expand=1].

   



> Resurrected empty rows on update to 3.x
> ---------------------------------------
>
>                 Key: CASSANDRA-12620
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12620
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Collin Sauve
>            Assignee: Benjamin Lerer
>
> We had the below table on C* 2.x (dse 4.8.4, we assume was 2.1.15.1423 according to documentation), and were entering TTLs at write-time using the DataStax C# Driver (using the POCO mapper).
> Upon upgrade to 3.0.8.1293 (DSE 5.0.2), we are seeing a lot of rows that:
> * should have been TTL'd
> * have no non-primary-key column data
> {code}
> CREATE TABLE applicationservices.aggregate_bucket_event_v3 (
>     bucket_type int,
>     bucket_id text,
>     date timestamp,
>     aggregate_id text,
>     event_type int,
>     event_id text,
>     entities list<frozen<tuple<int, text>>>,
>     identity_sid text,
>     PRIMARY KEY ((bucket_type, bucket_id), date, aggregate_id, event_type, event_id)
> ) WITH CLUSTERING ORDER BY (date DESC, aggregate_id ASC, event_type ASC, event_id ASC)
>     AND bloom_filter_fp_chance = 0.1
>     AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
>     AND comment = ''
>     AND compaction = {'class': 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'}
>     AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
>     AND crc_check_chance = 1.0
>     AND dclocal_read_repair_chance = 0.1
>     AND default_time_to_live = 0
>     AND gc_grace_seconds = 864000
>     AND max_index_interval = 2048
>     AND memtable_flush_period_in_ms = 0
>     AND min_index_interval = 128
>     AND read_repair_chance = 0.0
>     AND speculative_retry = '99PERCENTILE';
> {code}
> {code}
> {
>     "partition" : {
>       "key" : [ "0", "26492" ],
>       "position" : 54397932
>     },
>     "rows" : [
>       {
>         "type" : "row",
>         "position" : 54397961,
>         "clustering" : [ "2016-09-07 23:33Z", "3651664", "0", "773665449947099136" ],
>         "liveness_info" : { "tstamp" : "2016-09-07T23:34:09.758Z", "ttl" : 172741, "expires_at" : "2016-09-09T23:33:10Z", "expired" : false },
>         "cells" : [
>           { "name" : "identity_sid", "value" : "p_tw_zahidana" },
>           { "name" : "entities", "deletion_info" : { "marked_deleted" : "2016-09-07T23:34:09.757999Z", "local_delete_time" : "2016-09-07T23:34:09Z" } },
>           { "name" : "entities", "path" : [ "936e17e1-7553-11e6-9b92-29a33b5827c3" ], "value" : "0:https\\://www.youtube.com/watch?v=pwAJAssv6As" },
>           { "name" : "entities", "path" : [ "936e17e2-7553-11e6-9b92-29a33b5827c3" ], "value" : "2:youtube" }
>         ]
>       },
>       {
>         "type" : "row",
>    },
>       {
>         "type" : "row",
>         "position" : 54397177,
>         "clustering" : [ "2016-08-17 10:00Z", "6387376", "0", "765850666296225792" ],
>         "liveness_info" : { "tstamp" : "2016-08-17T11:26:15.917001Z" },
>         "cells" : [ ]
>       },
>       {
>         "type" : "row",
>         "position" : 54397227,
>         "clustering" : [ "2016-08-17 07:00Z", "6387376", "0", "765805367347601409" ],
>         "liveness_info" : { "tstamp" : "2016-08-17T08:11:17.587Z" },
>         "cells" : [ ]
>       },
>       {
>         "type" : "row",
>         "position" : 54397276,
>         "clustering" : [ "2016-08-17 04:00Z", "6387376", "0", "765760069858365441" ],
>         "liveness_info" : { "tstamp" : "2016-08-17T05:58:11.228Z" },
>         "cells" : [ ]
>       },
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
