What happens in Cassandra with your scenario is the following:

1) insert new record
   -> the record is added to Cassandra's dataset (with the given timestamp)

2) delete record
   -> a tombstone is added to the data set (with the timestamp of the delete,
      which should be larger than the timestamp in 1), otherwise the delete
      will be lost)

3) insert new record with same key as deleted record
   -> the record is added as in 1), but the timestamp should be larger than
      the timestamps from both 1) and 2)
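A minimal sketch of those three steps, written as today's CQL through the DataStack Python driver's execute() calls (the keyspace 'ks', the 'records' table and the timestamp values are made up for illustration; the rules about relative timestamps are the point):

from cassandra.cluster import Cluster

# Connect to a local node; keyspace 'ks' and table 'records' are assumed to exist.
session = Cluster(['127.0.0.1']).connect('ks')

# 1) insert the record with an explicit write timestamp
session.execute(
    "INSERT INTO records (key, value) VALUES (%s, %s) USING TIMESTAMP %s",
    ('user42', 'first value', 1000))

# 2) delete it -- the tombstone must carry a larger timestamp than the insert,
#    otherwise the delete loses the comparison and is silently ignored
session.execute(
    "DELETE FROM records USING TIMESTAMP %s WHERE key = %s",
    (2000, 'user42'))

# 3) re-insert under the same key, with a timestamp larger than both 1) and 2)
session.execute(
    "INSERT INTO records (key, value) VALUES (%s, %s) USING TIMESTAMP %s",
    ('user42', 'second value', 3000))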
If you compact between 2) and 3), the record inserted at 1) will be thrown away, but the tombstone from 2) will not be thrown away *unless* the tombstone was created more than GCGraceSeconds (a configuration option) before the compaction.
If you do not compact, all records and tombstones will be present in the dataset, and each read operation checks which of the records has the latest timestamp before returning the most current record (or reporting that the record was deleted, if the tombstone is the most recent entry).
So whether you compact or not does not make a difference for your scenario, as long as all replicas see the tombstone before GCGraceSeconds have passed. If that is not the case, it is possible that deleted records come alive again, because the tombstones are deleted before all replicas had a chance to process the delete.
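If you need to tune that window: the GCGraceSeconds setting survives in current Cassandra as the per-table gc_grace_seconds property (default 864000 seconds, i.e. ten days). A sketch, reusing the session from above:

# Shorten the grace period to one day: tombstones become eligible for
# collection sooner, so lagging replicas have less time to see the delete.
session.execute("ALTER TABLE records WITH gc_grace_seconds = 86400")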
Your question about concurrently inserting the same key from different clients is another beast. The simple answer is: don't do it. The longer answer: either you use some external synchronisation mechanism (e.g. Zookeeper), or you make sure that all clients use disjoint keys (UUIDs, keys derived from the client's IP address plus a timestamp, that sort of thing). If the keys represent user accounts or something similar, I would go with an external synchronisation mechanism, because for actions like user registration the latency caused by such a mechanism is usually not a problem. If the data is coming in quickly, where the overhead of synchronisation is not acceptable, use the UUID variant and reconcile the data on read.
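A minimal sketch of the disjoint-keys variant (the 'events' table and its columns are made up): uuid1() builds a time-based UUID from the local host identifier plus a timestamp, so keys generated on different clients cannot collide.

import uuid

# Each client generates its own key; no coordination is needed at write time.
key = uuid.uuid1()
session.execute(
    "INSERT INTO events (id, payload) VALUES (%s, %s)",
    (key, 'some payload'))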
I've been developing a system against Cassandra over the last few weeks, and I'd like to ask the community for some advice on the best way to deal with inserting new data where the key is currently a tombstone. As with all distributed systems, this is always a tricky thing to deal with, so I thought I'd throw it out to a wider audience.
1) insert new record.
2) delete record.
3) insert record with same key as deleted record.
Now I know I can make this work if I flush and compact between 2 and 3. However, I don't want to rely on a flush and compact, so I'd like to code defensively against this scenario. I've ended up looking up the key to see if it exists; if it does, then I know I can't insert the data. However, if the key does not exist, then I attempt an insert.
Now, here lies the issue. If I have more than one client doing this at the same time, both trying to insert using the same key, one will succeed and one will fail. However, neither insert gives me any indication of which one actually succeeded.
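A minimal sketch of that check-then-insert pattern and where the race opens (table and column names assumed):

def insert_if_absent(session, key, value):
    # Read first: a missing or tombstoned key returns no row.
    if session.execute(
            "SELECT key FROM records WHERE key = %s", (key,)).one() is not None:
        return False  # key already present, refuse the insert

    # Race window: a second client can run the same check here, also see no
    # row, and insert as well -- neither write reports a conflict, and the
    # one with the higher timestamp silently wins.
    session.execute(
        "INSERT INTO records (key, value) VALUES (%s, %s)", (key, value))
    return True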
So should an insert against an existing key, or a deleted key, produce some kind of exception?