From: Victor Kabdebon
Date: Sat, 5 Feb 2011 17:38:28 -0500
Subject: Re: revisioned data
To: user@cassandra.apache.org

Hello Raj,

No, it actually doesn't make sense from Cassandra's point of view.
The OrderingPartitioner preserves the order of the *keys*. The ordering in your layout will be done according to the *supercolumn name*. In that case you can set the ordering with compare_super_with (sorry, I don't remember exactly the new term in Cassandra, but that's the idea). compare_with will then order the columns inside your supercolumn.

However, and I think many will agree here, try to avoid SuperColumns. Rather than using SuperColumns, try to think of it like this:

CF1 : "ObjectStore"
Key : ID (long)
Columns : {
    name
    other fields
    update time (long [date])
    ...
}

CF2 : "ObjectOrder"
Key : "myorderedobjects"
Columns : {
    { name  : identifier that can be sorted
      value : ObjectID },
    ...
}
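
To make the two-column-family layout above a bit more concrete, here is a minimal in-memory sketch in Python. It is not Cassandra client code: plain dicts and a sorted list stand in for the column families. Several things in it are assumptions for illustration only and are not part of the layout above: each revision gets its own ObjectStore row under a hypothetical "id:update_time" key, the sortable column name in ObjectOrder is the pair (object id, update time), and the function names store_revision and get_as_of are made up. The lookup uses a strict "update time < as-of time" comparison, as in the SQL version of the question.

```python
import bisect

# In-memory stand-ins for the two column families (not a Cassandra client).
object_store = {}                          # CF1 "ObjectStore": row key -> columns
object_order = {"myorderedobjects": []}    # CF2 "ObjectOrder": row key -> sorted (name, value) pairs


def store_revision(object_id, update_time, fields):
    """Write one revision to ObjectStore and index it in ObjectOrder."""
    revision_key = f"{object_id}:{update_time}"  # hypothetical key scheme
    object_store[revision_key] = dict(fields, update_time=update_time)
    # Column name (object_id, update_time) plays the role of the
    # "identifier that can be sorted"; the value points back to CF1.
    column = ((object_id, update_time), revision_key)
    bisect.insort(object_order["myorderedobjects"], column)


def get_as_of(object_id, as_of_time):
    """Return the latest revision with update_time < as_of_time, or None."""
    columns = object_order["myorderedobjects"]
    names = [name for name, _ in columns]
    # Index of the first column name >= (object_id, as_of_time);
    # the column just before it is the candidate revision.
    pos = bisect.bisect_left(names, (object_id, as_of_time))
    if pos == 0:
        return None
    (found_id, _), revision_key = columns[pos - 1]
    if found_id != object_id:
        return None  # no revision of this object is old enough
    return object_store[revision_key]


store_revision(625, 1000, {"name": "first"})
store_revision(625, 2000, {"name": "second"})
store_revision(626, 1500, {"name": "other"})
```

With these three revisions stored, get_as_of(625, 1500) walks the sorted column names and lands on the revision written at time 1000. In a real cluster the same idea would be a reversed column slice on the ObjectOrder row followed by a read of the referenced ObjectStore row.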

Best regards,
Victor Kabdebon,
http://www.voxnucleus.fr

2011/2/5 Raj Bakhru <rbakhru@gmail.com>
> Hi all -
>
> We're new to Cassandra and have read plenty on the data model, but we
> wanted to poll for thoughts on how best to handle this structure.
>
> We have simple objects that have an ID, and we want to maintain a history
> of all the revisions.
>
> e.g.
> MyObject:
>     ID (long)
>     name
>     other fields
>     update time (long [date])
>
> Any time the object changes, we'll store a new version of the object
> (same ID, but different update time and other fields). We need to be able
> to query what the object was as of any time historically. We also need
> to be able to query what some or all of the items of this object type
> were as of any time historically.
>
> In SQL, we'd just find the max(id) where update time < queried_as_of_time
>
> In Cassandra, we were thinking of modeling as follows:
>
> CF: MyObjectType
> Super-Column: ID of object (e.g. 625)
> Column: updatetime (e.g. "1000245242")
> Value: byte[] of serialized object
>
> We were thinking of using the OrderingPartitioner and using range queries
> against the data.
>
> Does this make sense? Are we approaching this in the wrong way?
>
> Thanks a lot



