cassandra-commits mailing list archives

From "Kelvin Kakugawa (JIRA)" <>
Subject [jira] Commented: (CASSANDRA-580) vector clock support
Date Mon, 07 Dec 2009 20:33:18 GMT


Kelvin Kakugawa commented on CASSANDRA-580:

I've been talking w/ the authors of the interval tree clocks (ITC) paper about how to apply
ITC to Cassandra, and it looks like we may need to modify the ITC algorithm for our use-case.

The crux of the matter is Cassandra's hinted hand-off feature.  The ITC algorithm composes
an id-tree and event-tree to represent the version of a given value.  The id-tree is a nice
way to create unique ids on-the-fly for any node (by splitting the id-tree, as necessary)
and the event-tree represents causality.  However, the problem is that for a node to update
the event-tree for a value, it has to be assigned a part of the id-tree beforehand.

A short example follows:
Suppose a node tries to forward a value but, because of a failure scenario, has to store the
value locally.  It can't update the version of the value unless it was assigned a part of
the id-tree beforehand by the set of nodes responsible for the value.
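
To make the constraint concrete, here is a rough Python sketch of the id-tree half of ITC as I understand it from the paper.  The tuple encoding (0, 1, or a (left, right) pair) and the names split and can_record_event are my own for illustration, and the event-tree is left out entirely.  The only point is that a node holding the empty id 0 has nothing to attribute an event to.

```python
# Simplified sketch of ITC id-trees; encoding and names are illustrative assumptions.
# An id is 0 (owns nothing), 1 (owns the whole interval), or a pair (left, right).

def split(i):
    """Split an id-tree into two disjoint ids that together cover the original."""
    if i == 0:
        return 0, 0
    if i == 1:
        return (1, 0), (0, 1)
    left, right = i
    if left == 0:
        r1, r2 = split(right)
        return (0, r1), (0, r2)
    if right == 0:
        l1, l2 = split(left)
        return (l1, 0), (l2, 0)
    return (left, 0), (0, right)


def can_record_event(i):
    """A node may only advance the event-tree if it owns a non-empty id."""
    return i != 0


# Two replicas responsible for the value share the id space between them.
replica_a, replica_b = split(1)

# A third node that only holds the value temporarily was never given an id.
hint_holder = 0

print(can_record_event(replica_a))    # True
print(can_record_event(hint_holder))  # False: it cannot bump the version
```

In this sketch the hint holder would have to either ask a replica to split its id first, which is exactly what it can't do while the replicas are unreachable, or store the value with the version unchanged.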

The authors have a couple of solutions:
1) Split the id-tree between all nodes in the cluster from the very start.  This solves the
problem, but it gives up the main benefit of ITC over traditional version vectors,
i.e. dynamically partitioning the id space at run-time, and only to the extent necessary, to
conserve space.
2) On client reads, do a "fork" instead of a "peek" and share the id-tree w/ the client.
However, this is a more complicated approach that may need to be worked out some more.
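
The two options above could look roughly like this.  Again a sketch under my own assumptions: the tuple encoding of ids and the names prefork, fork_on_read and depth are made up for illustration, and how a client's id would be joined back is not shown, since that part still needs to be worked out.

```python
# Sketch of both options; ids are 0, 1, or (left, right) pairs (illustrative encoding).
from collections import deque


def split(i):
    """Split an id-tree into two disjoint ids that together cover the original."""
    if i == 0:
        return 0, 0
    if i == 1:
        return (1, 0), (0, 1)
    left, right = i
    if left == 0:
        r1, r2 = split(right)
        return (0, r1), (0, r2)
    if right == 0:
        l1, l2 = split(left)
        return (l1, 0), (l2, 0)
    return (left, 0), (0, right)


def prefork(n):
    """Option 1: hand every one of n nodes a piece of the id-tree up front."""
    if n < 1:
        raise ValueError("need at least one node")
    ids = deque([1])
    while len(ids) < n:
        a, b = split(ids.popleft())  # always split the oldest id to stay balanced
        ids.extend((a, b))
    return list(ids)


def fork_on_read(replica_id):
    """Option 2: the replica keeps one half of its id and gives the other to the client."""
    kept, client_id = split(replica_id)
    return kept, client_id


def depth(i):
    """Nesting depth of an id-tree, as a crude measure of its size."""
    return 1 + max(depth(i[0]), depth(i[1])) if isinstance(i, tuple) else 0


print(prefork(2))                            # [(1, 0), (0, 1)]
print(max(depth(i) for i in prefork(4)))     # 2
print(max(depth(i) for i in prefork(64)))    # 6
```

With pre-forking, every id carries a path whose depth grows roughly with log2 of the cluster size, whether or not that node ever writes the value.  That's the space cost option 1 trades for never being caught without an id.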

In any case, since we're using an opaque context, these decisions won't affect the interface.
 However, it's an interesting implementation concern.  Depending on the average size of a
Cassandra cluster, it may or may not be worth pre-forking the id-tree to all nodes from the
very start.

> vector clock support
> --------------------
>                 Key: CASSANDRA-580
>                 URL:
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>         Environment: N/A
>            Reporter: Kelvin Kakugawa
>            Assignee: Kelvin Kakugawa
>         Attachments: 580-interface-1-add-vector-clock.diff, 580-interface-2-add-vector-clock.diff, 580-thrift-v3.patch, 580-thrift-v4.patch, 580-thrift-v5.patch
>   Original Estimate: 672h
>  Remaining Estimate: 672h
> Allow a ColumnFamily to be versioned via vector clocks, instead of long timestamps.
> Purpose: enable incr/decr; flexible conflict resolution.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
