incubator-cassandra-user mailing list archives

From David Jeske <dav...@gmail.com>
Subject Re: Not all data structures need timestamps (and don't require wasted memory).
Date Mon, 12 Sep 2011 17:53:12 GMT
After writing my message, I recognized a scenario you might be referring to,
Kevin.

If I understand correctly, you're not referring to set-membership in the
general sense, where one could add and remove entries. General
set-membership, in the context of eventual-consistency, requires timestamps.
The timestamps distinguish between the two values "present" and
"not-present" (the latter being represented by timestamped tombstones in
the case of deletion/removal).
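
A rough sketch of why general set-membership needs the timestamps. This is
only an illustrative model, not Cassandra's actual code: the names Replica,
merge, and members are made up here, and ties between equal timestamps are
simply resolved in favor of the first replica.

```python
from typing import Dict, Set, Tuple

# Hypothetical model: each replica maps key -> (timestamp, present).
# An entry with present=False plays the role of a timestamped tombstone.
Replica = Dict[str, Tuple[int, bool]]


def merge(a: Replica, b: Replica) -> Replica:
    """Reconcile two replicas by keeping the newest entry for each key."""
    merged = dict(a)
    for key, entry in b.items():
        # Only a strictly newer timestamp replaces what we already hold.
        if key not in merged or entry[0] > merged[key][0]:
            merged[key] = entry
    return merged


def members(replica: Replica) -> Set[str]:
    """Keys whose newest known state is "present"."""
    return {key for key, (_, present) in replica.items() if present}


replica_1 = {"x": (1, True)}   # "x" was added at t=1
replica_2 = {"x": (2, False)}  # "x" was removed at t=2 (tombstone)

# The removal is newer, so it wins no matter which way the merge runs.
assert members(merge(replica_1, replica_2)) == set()
assert members(merge(replica_2, replica_1)) == set()
```

Without the timestamp, the two replicas would hold "present" and
"not-present" for the same key with no way to decide which one is current.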

So I suppose you're referring to "additive-only set membership", where there
is no need to distinguish between two different states (such as present or
not present in a set), because items can only be added, never changed or
removed. If entries are not allowed to be deleted or modified, then
cassandra-style eventual consistency replication could occur without any
timestamp, because you're simply replicating the existence of keys to all
replicas.
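
A minimal sketch of the additive-only case, again as an assumed model rather
than anything taken from Cassandra: if keys can only be added, reconciling
two replicas is just a set union, and a union gives the same result no matter
the order or number of times the replicas exchange data. The function name
merge_add_only is hypothetical.

```python
from typing import FrozenSet


def merge_add_only(a: FrozenSet[str], b: FrozenSet[str]) -> FrozenSet[str]:
    """Reconcile two add-only replicas: the result is simply every key seen."""
    return a | b


r1 = frozenset({"a", "b"})
r2 = frozenset({"b", "c"})
r3 = frozenset({"d"})

# Order of exchange does not matter (commutative).
assert merge_add_only(r1, r2) == merge_add_only(r2, r1)
# Grouping of exchanges does not matter (associative).
assert merge_add_only(merge_add_only(r1, r2), r3) == merge_add_only(
    r1, merge_add_only(r2, r3)
)
# Re-delivering the same data changes nothing (idempotent).
assert merge_add_only(r1, r1) == r1
```

No per-entry timestamp appears anywhere in this model, which is the saving
being discussed. The cost is that there is no operation that can take a key
back out.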

To me this seems a particularly narrow use-case. Any inadvertent write (even
one from a bug or data corruption) would require very frustrating manual
intervention to remove: you'd have to manually shut down all nodes, manually
purge the bad values from the dataset, then bring the nodes back online. I'm
not a Cassandra developer, but this seems like a path which is very
specialized and not very in line with Cassandra's design.

You might have better luck with a distributed store that is not based on
timestamp-driven eventual consistency. I don't know if you can explicitly
turn off timestamps in HBase, but AFAIK the client is allowed to supply them,
so you can just supply zero and they should compress out quite well.
