cassandra-commits mailing list archives

From "Alex Petrov (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-12144) Undeletable rows after upgrading from 2.2.4 to 3.0.7
Date Mon, 11 Jul 2016 20:08:11 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-12144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15367832#comment-15367832 ]

Alex Petrov edited comment on CASSANDRA-12144 at 7/11/16 8:07 PM:
------------------------------------------------------------------

The {{2.x}} storage format doesn't guarantee that there will be a single range tombstone, or
that tombstones will be in a certain order relative to the cells. Under some circumstances (which
I unfortunately could not reproduce), we ended up with multiple tombstones followed by the
row:

{code}
[
{"key": "11111",
 "cells": [["12345:_","12345:!",<timestamp>,"t",<local_deletion_time>], (*1)
           ["12345:_","12345:!",<timestamp>,"t",<local_deletion_time>], (*2)
           ["12345:","",<time>],
           ["12345:c1","xxxxxx",<time>],
           ["12345:c2","yyyyyy",<time>]]}
]
{code}

This resulted in two rows: a tombstone row made from {{(*1)}}, and a second, live row made
from the second tombstone {{(*2)}} and the cells following it (the deletion time was such that
the cells should be live).

Both rows were then carried into the new storage format by {{sstableupgrade}}. During
iteration, the first (tombstone) row was read out, but since the second row was also read out
and the remaining merge iterators (a superseding delete might have been in the memtable or in
any other sstable) were exhausted, the second row was treated as a completely normal live row.
It was undeletable because all deletes would only affect the tombstone, whose clustering matched.
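
To make the sequence above concrete, here is a minimal sketch in Python. It is a toy model of the behaviour described in this comment, not Cassandra's actual conversion code: the {{Row}} class, {{naive_convert}}, {{is_live}}, the atom dictionaries and the timestamps 100 and 200 are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class Row:
    clustering: str
    deletion_ts: Optional[int] = None  # row-level deletion timestamp, if any
    cells: Dict[str, Tuple[str, int]] = field(default_factory=dict)  # column -> (value, ts)


# Toy stand-in for the legacy layout shown above: two identical tombstones,
# then the row marker and two cells. Timestamps are arbitrary placeholders,
# picked so that the cells are newer than the deletion.
LEGACY_ATOMS = [
    {"kind": "tombstone", "clustering": "12345", "ts": 100},  # (*1)
    {"kind": "tombstone", "clustering": "12345", "ts": 100},  # (*2)
    {"kind": "cell", "clustering": "12345", "column": "", "value": "", "ts": 200},
    {"kind": "cell", "clustering": "12345", "column": "c1", "value": "xxxxxx", "ts": 200},
    {"kind": "cell", "clustering": "12345", "column": "c2", "value": "yyyyyy", "ts": 200},
]


def naive_convert(atoms):
    """Assumed (flawed) behaviour: every tombstone opens a new row."""
    rows = []
    current = None
    for atom in atoms:
        if atom["kind"] == "tombstone":
            # A new row is opened even if the previous row has the same clustering.
            current = Row(atom["clustering"], deletion_ts=atom["ts"])
            rows.append(current)
        else:
            if current is None or current.clustering != atom["clustering"]:
                current = Row(atom["clustering"])
                rows.append(current)
            current.cells[atom["column"]] = (atom["value"], atom["ts"])
    return rows


def is_live(row):
    """A row is live if at least one cell is newer than its deletion."""
    if row.deletion_ts is None:
        return bool(row.cells)
    return any(ts > row.deletion_ts for _, ts in row.cells.values())


rows = naive_convert(LEGACY_ATOMS)
for row in rows:
    print(row.clustering, "live" if is_live(row) else "tombstone only")
```

Under these assumptions the converter emits two rows with the same clustering: the first carries only the deletion, the second carries the deletion plus newer cells and therefore looks live.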

I've made a patch that captures this edge case.

|[3.0|https://github.com/ifesdjeen/cassandra/tree/12144-3.0] |[utest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-12144-3.0-testall/] |[dtest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-12144-3.0-dtest/] |[upgrade tests|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/upgrade_tests-all-12144-3.0/]|
|[trunk|https://github.com/ifesdjeen/cassandra/tree/12144-trunk] |[utest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-12144-trunk-testall/] |[dtest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-12144-trunk-dtest/] |[upgrade tests|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/upgrade_tests-all-12144-trunk/]|

I'll run CI and submit the patch if it's successful (I'm particularly interested in the upgrade dtests).


After talking to [~slebresne], we might also have to provide a fix for the scrub tool that would
detect and fix such cases.

Many thanks to [~stanislav] for providing the information required to track this issue down.
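
As a rough illustration of what "detecting such cases" could mean, the sketch below (Python) scans the clusterings of one partition in storage order and reports any value that appears in adjacent rows. This is only an assumption about the shape of such a check, not the scrub tool's implementation; {{find_duplicate_clusterings}} is a hypothetical helper, and the sample input reuses the {{id}} from the issue description.

```python
def find_duplicate_clusterings(clusterings):
    """Return each clustering that occurs in two or more adjacent rows.

    `clusterings` is assumed to be the clustering values of a single
    partition, in the order the rows are stored.
    """
    duplicates = []
    previous = object()  # sentinel that never equals a real clustering
    for clustering in clusterings:
        already_reported = bool(duplicates) and duplicates[-1] == clustering
        if clustering == previous and not already_reported:
            duplicates.append(clustering)
        previous = clustering
    return duplicates


# The partition from the issue description returned the same id twice.
suspect = find_duplicate_clusterings([153047019424972800, 153047019424972800])
print(suspect)
```

A real fix would also have to decide how to merge the duplicated rows; this sketch only covers detection.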


> Undeletable rows after upgrading from 2.2.4 to 3.0.7
> ----------------------------------------------------
>
>                 Key: CASSANDRA-12144
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12144
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Stanislav Vishnevskiy
>            Assignee: Alex Petrov
>
> We upgraded our cluster today and now have some rows that refuse to delete.
> Here are some example traces.
> https://gist.github.com/vishnevskiy/36aa18c468344ea22d14f9fb9b99171d
> Even weirder.
> Updating the row and querying it back results in 2 rows even though the id is the clustering key.
> {noformat}
> user_id            | id                 | since                    | type
> -------------------+--------------------+--------------------------+------
> 116138050710536192 | 153047019424972800 |                     null |    0
> 116138050710536192 | 153047019424972800 | 2016-05-30 14:53:08+0000 |    2
> {noformat}
> And then deleting it again only removes the new one.
> {noformat}
> cqlsh:discord_relationships> DELETE FROM relationships WHERE user_id = 116138050710536192 AND id = 153047019424972800;
> cqlsh:discord_relationships> SELECT * FROM relationships WHERE user_id = 116138050710536192 AND id = 153047019424972800;
>  user_id            | id                 | since                    | type
> --------------------+--------------------+--------------------------+------
>  116138050710536192 | 153047019424972800 | 2016-05-30 14:53:08+0000 |    2
> {noformat}
> We tried repairing, compacting, scrubbing. No luck.
> Not sure what to do. Is anyone aware of this?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
