incubator-cassandra-user mailing list archives

From Victor Kabdebon <victor.kabde...@gmail.com>
Subject Re: revisioned data
Date Sat, 05 Feb 2011 22:38:28 GMT
Hello Raj,

No, it doesn't actually make sense from Cassandra's point of view:
OrderingPartitioner preserves the order of the *keys*. In your model the
ordering will be done according to the *supercolumn name*. In that case you
can set the ordering with compare_super_with (sorry, I don't remember the
exact new term in Cassandra, but that's the idea). The compare_with will
order your columns inside your supercolumn.
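
To make the distinction concrete, here is a small Python sketch. It is a toy
in-memory model, not Cassandra client code, and the names `row` and
`slice_columns` are made up for the illustration. The partitioner decides how
row keys are ordered; a range over update times is a slice over the column
names inside one row, which are ordered by the comparator:

```python
# Toy model of a single row whose column names are update times.
# Inside a real row, the configured comparator keeps these names sorted.
row = {1000245300: b"v2", 1000245242: b"v1", 1000245999: b"v3"}

def slice_columns(row, start, finish):
    """Return (name, value) pairs with start <= name <= finish, in name order."""
    return [(name, row[name]) for name in sorted(row) if start <= name <= finish]

# A range query over update times touches column names, not row keys.
revisions = slice_columns(row, 1000245242, 1000245300)
```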

However, and I think many will agree here, try to avoid SuperColumns.
Rather than using SuperColumns, try to think of it like this:

CF1 : "ObjectStore"
Key :ID (long)
Columns : {
    name
    other fields
    update time (long [date])
    ...}

CF2 : "ObjectOrder"
Key : "myorderedobjects"
Column:{
   { name : identifier that can be sorted
   value :ObjectID},
   ...
}
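
As a rough sketch of how the two column families could serve the "as-of"
lookup from the original question, here is a toy Python model using plain
dicts instead of a Cassandra client. Several things are assumptions that are
not in the layout above: each revision gets its own ObjectStore row key
(`"<id>:<update_time>"`), there is one ObjectOrder row per object
(`"revisions:<id>"`) rather than a single "myorderedobjects" row, and the
sortable column name is the update time itself. The helper names
`save_revision` and `as_of` are hypothetical.

```python
from bisect import bisect_left

object_store = {}  # CF1 "ObjectStore": row key -> {column name: value}
object_order = {}  # CF2 "ObjectOrder": row key -> {sortable name: ObjectStore key}

def save_revision(obj_id, update_time, fields):
    # Assumption: every revision is stored under its own ObjectStore row key.
    store_key = f"{obj_id}:{update_time}"
    object_store[store_key] = dict(fields, update_time=update_time)
    # Assumption: one ordering row per object, column name = update time.
    object_order.setdefault(f"revisions:{obj_id}", {})[update_time] = store_key

def as_of(obj_id, when):
    """Return the latest revision with update_time < when, or None."""
    columns = object_order.get(f"revisions:{obj_id}", {})
    names = sorted(columns)          # a real row would already be sorted
    pos = bisect_left(names, when)   # number of names strictly below `when`
    if pos == 0:
        return None
    return object_store[columns[names[pos - 1]]]

save_revision(625, 1000245242, {"name": "first"})
save_revision(625, 1000245300, {"name": "second"})
```

With this shape, `as_of(625, 1000245299)` returns the first revision and
`as_of(625, 1000245301)` returns the second. Against a real cluster the same
lookup would presumably be a reversed column slice with a count of 1 on the
ordering row, followed by a read of the ObjectStore row it points to.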

Best regards,
Victor Kabdebon,
http://www.voxnucleus.fr

2011/2/5 Raj Bakhru <rbakhru@gmail.com>

> Hi all -
>
> We're new to Cassandra and have read plenty on the data model, but we
> wanted to poll for thoughts on how to best handle this structure.
>
> We have simple objects that have an ID and we want to maintain a history
> of all the revisions.
>
> e.g.
> MyObject:
>     ID (long)
>     name
>     other fields
>     update time (long [date])
>
>
> Any time the object changes, we'll store down a new version of the object
> (same ID, but different update time and other fields).  We need to be able
> to query out what the object was as-of any time historically.  We also need
> to be able to query out what some or all of the items of this object type
> were as-of any time historically.
>
> In SQL, we'd just find the max(id) where update time < queried_as_of_time
>
> In Cassandra, we were thinking of modeling as follows:
>
> CF:  MyObjectType
> Super-Column: ID of object (e.g. 625)
> Column:  updatetime  (e.g. "1000245242")
> Value: byte[] of serialized object
>
> We were thinking of using the OrderingPartitioner and using range queries
> against the data.
>
> Does this make sense?  Are we approaching this in the wrong way?
>
> Thanks a lot
