cassandra-commits mailing list archives

From "Jeffrey Wang (JIRA)" <>
Subject [jira] Commented: (CASSANDRA-2305) Tombstoned rows not purged from cache after gcgraceseconds
Date Fri, 11 Mar 2011 20:57:59 GMT


Jeffrey Wang commented on CASSANDRA-2305:

I actually don't have row cache enabled (I just checked cfstats to make sure), so I don't
think that's the cause of my problem in particular. Here's some more info that may or may
not be correct:

- When I run the compaction, in ColumnFamilyStore.removeDeletedStandard() I see that columns
are being removed because of the c.timestamp() <= cf.getMarkedForDeleteAt() condition,
which makes sense since I issued a delete on the entire row.
- However, after the compaction, I do the insert, and if I flush/compact again, I still see
the columns being removed because of that condition. It seems like the markedForDeleteAt field
on the ColumnFamily is persisting across the major compaction, which I believe is hiding the
newly inserted column.
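
To make the condition in the first bullet concrete, here is a minimal Python sketch of a row-level tombstone shadowing columns. It is a toy model, not Cassandra's code: `remove_deleted` and the dict layout are hypothetical, and only the `timestamp <= markedForDeleteAt` comparison is taken from the observation above.

```python
# Toy model only; not Cassandra's implementation.
# A row is modeled as a dict of column name -> write timestamp, and
# marked_for_delete_at is a hypothetical stand-in for markedForDeleteAt.

def remove_deleted(columns, marked_for_delete_at):
    """Return the columns that survive a row-level tombstone.

    A column is dropped when its timestamp is <= marked_for_delete_at,
    mirroring the c.timestamp() <= cf.getMarkedForDeleteAt() check.
    """
    return {name: ts for name, ts in columns.items()
            if ts > marked_for_delete_at}


row = {"A": 0, "B": 2}
survivors = remove_deleted(row, marked_for_delete_at=1)
print(survivors)  # {'B': 2}
```

Under this model, any later write with a timestamp at or below the marker is also dropped for as long as the marker is kept.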

Also, my initial steps to repro were not correct, which made it hard to figure out the root
cause. Here is a proper repro:

- Create a CF with gc_grace_seconds = 0 and no row cache.
- Insert row X, col A with timestamp 0.
- Insert row X, col B with timestamp 2.
- Remove row X with timestamp 1 (expect col A to disappear, col B to stay).
- Wait 1 second.
- Force flush and compaction.
- Insert row X, col A with timestamp 0.
- Read row X, col A (see nothing).

Inserting row X, col B is necessary for this to repro because if all the columns in a row
disappear, the ColumnFamily object goes away and the markedForDeleteAt field is reset. Only
when a column still exists does the field persist across the compaction. Hope this helps!
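
The repro steps and the explanation above can be simulated with a small toy model. This is only a sketch of the behavior as I have described it, not Cassandra's code: `ToyRow`, `compact`, `repro`, and the `reset_marker_when_purged` flag are all hypothetical, and the model assumes gc_grace_seconds has already elapsed when compaction runs.

```python
# Toy model of the behavior described in this comment; not Cassandra's code.
NO_DELETE = float("-inf")  # hypothetical "row was never deleted" marker


class ToyRow:
    def __init__(self):
        self.columns = {}  # column name -> write timestamp
        self.marked_for_delete_at = NO_DELETE

    def insert(self, name, timestamp):
        # Keep the newest timestamp seen for each column.
        if timestamp >= self.columns.get(name, NO_DELETE):
            self.columns[name] = timestamp

    def delete_row(self, timestamp):
        self.marked_for_delete_at = max(self.marked_for_delete_at, timestamp)

    def read(self, name):
        ts = self.columns.get(name)
        if ts is None or ts <= self.marked_for_delete_at:
            return None  # missing, or shadowed by the row-level tombstone
        return ts


def compact(rows, key, reset_marker_when_purged):
    """Drop shadowed columns, assuming gc_grace_seconds has elapsed."""
    row = rows[key]
    row.columns = {n: ts for n, ts in row.columns.items()
                   if ts > row.marked_for_delete_at}
    if not row.columns:
        del rows[key]  # the whole row object goes away, marker included
    elif reset_marker_when_purged:
        row.marked_for_delete_at = NO_DELETE


def repro(reset_marker_when_purged, include_b=True):
    rows = {"X": ToyRow()}
    rows["X"].insert("A", 0)
    if include_b:
        rows["X"].insert("B", 2)
    rows["X"].delete_row(1)
    compact(rows, "X", reset_marker_when_purged)
    rows.setdefault("X", ToyRow()).insert("A", 0)
    return rows["X"].read("A")


print(repro(False))                   # None: marker persists, col A hidden
print(repro(False, include_b=False))  # 0: row vanished, marker reset
print(repro(True))                    # 0: what I expected to see
```

The `include_b=False` case shows why col B is needed for the repro: without a surviving column, the row object disappears during compaction and the marker goes with it.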

> Tombstoned rows not purged from cache after gcgraceseconds
> ----------------------------------------------------------
>                 Key: CASSANDRA-2305
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.7.0
>            Reporter: Jeffrey Wang
>            Assignee: Sylvain Lebresne
>            Priority: Minor
>             Fix For: 0.7.4
>         Attachments: 0001-Compaction-test.patch, 0002-Invalidate-row-cache-on-compaction-purge.patch
>   Original Estimate: 2h
>          Time Spent: 2h
>  Remaining Estimate: 0h
> From email to list:
> I was wondering if this is the expected behavior of deletes (0.7.0). Let's say I have
> a 1-node cluster with a single CF which has gc_grace_seconds = 0. The following sequence of
> operations happens (in the given order):
> - insert row X with timestamp T
> - delete row X with timestamp T+1
> - force flush + compaction
> - insert row X with timestamp T
> My understanding is that the tombstone created by the delete (and row X) will disappear
> with the flush + compaction which means the last insertion should show up. My experimentation,
> however, suggests otherwise (the last insertion does not show up).
> I believe I have traced this to the fact that the markedForDeleteAt field on the ColumnFamily
> does not get reset after a compaction (after gc_grace_seconds has passed); is this desirable?
> I think it introduces an inconsistency in how tombstoned columns work versus tombstoned CFs.

