cassandra-commits mailing list archives

From "Sylvain Lebresne (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-5527) Deletion by Secondary Key
Date Wed, 01 May 2013 09:16:17 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-5527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13646477#comment-13646477 ]

Sylvain Lebresne commented on CASSANDRA-5527:
---------------------------------------------

I agree with Jonathan: I don't see how we could make that efficient without having to read
all the secondary key tombstones each time you read the row, which doesn't sound fun.

But as an aside, I'll note that another option for this is to use a secondary index. Now, I
know it's not read-free, but as long as you provide the partition key in the query, this
will not be horribly inefficient either. And you'll exchange slightly slower writes for no
hit whatsoever on reads, which I would suspect is a better trade-off more often than not for
that kind of operation.
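
To illustrate the secondary index route, here is a minimal CQL sketch against the timeline
table from the ticket. The index name and the literal values are made up for the example,
and the timeuuid in the DELETE stands in for whatever the SELECT returns:

```cql
-- Same table as in the ticket, minus the proposed KEY clause.
CREATE TABLE timeline (
    timeline_id uuid,
    entry_id timeuuid,
    entry_key blob,
    entry_payload blob,
    PRIMARY KEY (timeline_id, entry_id)
);

-- Regular secondary index on the would-be secondary key (index name is illustrative).
CREATE INDEX timeline_entry_key_idx ON timeline (entry_key);

-- Step 1: the read. Restricting on the partition key as well as the indexed
-- column means the lookup only has to consult that one partition.
SELECT entry_id FROM timeline
 WHERE timeline_id = 62c36092-82a1-3a00-93d1-46196ee77204
   AND entry_key = 0xcafe;

-- Step 2: an ordinary delete by primary key, once per entry_id returned above.
DELETE FROM timeline
 WHERE timeline_id = 62c36092-82a1-3a00-93d1-46196ee77204
   AND entry_id = 50554d6e-29bb-11e5-b345-feff819cdc9f;
```

The delete itself is a normal primary key tombstone, so reads of the timeline pay nothing
extra; the cost is the read before each delete plus the index maintenance on writes.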
                
> Deletion by Secondary Key
> -------------------------
>
>                 Key: CASSANDRA-5527
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5527
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Rick Branson
>
> Given Cassandra's popularity as a time-ordered list store, the inability to do deletes
> by anything other than the primary key without re-implementing tombstones in the application
> is a bit of an Achilles' heel for many use cases. It's a data modeling problem that seems to
> come up quite often, and given that we now have the CQL3 abstraction layer sitting on top
> of the storage engine, I think there's an opportunity to take this burden off of the application
> layer. I've spent several weeks thinking about this problem within the context of Cassandra,
> and I think I've come up with a reasonable proposal.
> It would involve addition of a secondary key facility to CQL3 tables:
> CREATE TABLE timeline (
> 	timeline_id uuid,
> 	entry_id timeuuid,
> 	entry_key blob,
> 	entry_payload blob,
> 	PRIMARY KEY (timeline_id, entry_id),
> 	KEY (timeline_id, entry_key)
> );
> Secondary keys would be required to share the same partition key with the primary key.
> They would be included to support deletion by secondary key operations:
> DELETE FROM timeline WHERE timeline_id = <X> and entry_key = <Y>;
> Underneath, the storage engine row would contain additional secondary key tombstones.
> Secondary key deletion would be read-free, requiring a single tombstone write. The cost of
> reads would necessarily go up. Queries would need to be modified to perform an additional
> step to find any matching secondary key tombstones and perform the regular convergence process.
> The secondary key tombstones should be cleaned up by the regular tombstone GC process.
> While I didn't want to complicate this idea too much, it might also be worth having a
> discussion around supporting secondary key queries as well, or at least making the schema
> compatible with potential future support (maybe rename KEY to DELETABLE KEY or something).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
