From: Jonathan Ellis
Date: Sat, 5 Feb 2011 22:51:00 -0600
Subject: Re: revisioned data
To: user@cassandra.apache.org

Using supercolumns to contain versions is reasonable, as long as the
number of versions is not too large.

On Sat, Feb 5, 2011 at 4:38 PM, Victor Kabdebon wrote:
> Hello Raj,
>
> No it actually doesn't make sense from the point of view of Cassandra;
> OrderingPartitioner preserves the order of the keys. The ordering will be done
> according to the supercolumn name. In that case you can set the ordering
> with compare_super_with (sorry I don't remember exactly the new term in
> Cassandra, but that's the idea). The compare_with will order your columns
> inside your supercolumn.
>
> However, and I think that many will agree here, try to avoid SuperColumns.
> Rather than using SuperColumns, try to think of it like this:
>
> CF1 : "ObjectStore"
> Key : ID (long)
> Columns : {
>     name
>     other fields
>     update time (long [date])
>     ...}
>
> CF2 : "ObjectOrder"
> Key : "myorderedobjects"
> Column : {
>    { name : identifier that can be sorted
>    value : ObjectID},
>    ...
> }
>
> Best regards,
> Victor Kabdebon,
> http://www.voxnucleus.fr
>
> 2011/2/5 Raj Bakhru
>>
>> Hi all -
>>
>> We're new to Cassandra and have read plenty on the data model, but we
>> wanted to poll for thoughts on how to best handle this structure.
>>
>> We have simple objects that have an ID and we want to maintain a history
>> of all the revisions.
>>
>> e.g.
>> MyObject:
>>     ID (long)
>>     name
>>     other fields
>>     update time (long [date])
>>
>>
>> Any time the object changes, we'll store down a new version of the object
>> (same ID, but different update time and other fields).  We need to be able
>> to query out what the object was as-of any time historically.  We also need
>> to be able to query out what some or all of the items of this object type
>> were as-of any time historically.
>>
>> In SQL, we'd just find the max(id) where update time < queried_as_of_time
>>
>> In Cassandra, we were thinking of modeling as follows:
>>
>> CF:  MyObjectType
>> Super-Column: ID of object (e.g. 625)
>> Column:  updatetime  (e.g. "1000245242")
>> Value: byte[] of serialized object
>>
>> We were thinking of using the OrderingPartitioner and using range queries
>> against the data.
>>
>> Does this make sense?  Are we approaching this in the wrong way?
>>
>> Thanks a lot
>>
>

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
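
For readers following the thread, the as-of lookup Raj describes (latest
revision with update time < queried_as_of_time, per object ID) can be
sketched in plain Python. This is only an in-memory illustration of the
"columns sorted by update time" idea, not Cassandra client code; the names
RevisionStore, put, as_of and all_as_of are hypothetical, and the strict
"<" comparison is taken from the SQL phrasing in the original question.

```python
from bisect import bisect_left, bisect_right


class RevisionStore:
    """In-memory stand-in: one row per object ID, revisions sorted by update time."""

    def __init__(self):
        # object_id -> (sorted list of update times, payloads in the same order)
        self._rows = {}

    def put(self, object_id, update_time, payload):
        """Store a new revision, keeping the row sorted by update time."""
        times, payloads = self._rows.setdefault(object_id, ([], []))
        i = bisect_right(times, update_time)
        times.insert(i, update_time)
        payloads.insert(i, payload)

    def as_of(self, object_id, as_of_time):
        """Return the latest payload with update_time < as_of_time, or None."""
        times, payloads = self._rows.get(object_id, ([], []))
        # bisect_left counts the revisions strictly older than as_of_time.
        i = bisect_left(times, as_of_time)
        return payloads[i - 1] if i else None

    def all_as_of(self, as_of_time):
        """Return {object_id: payload} for every object that existed before as_of_time."""
        result = {}
        for object_id in self._rows:
            payload = self.as_of(object_id, as_of_time)
            if payload is not None:
                result[object_id] = payload
        return result


if __name__ == "__main__":
    store = RevisionStore()
    store.put(625, 1000245242, b"revision 1")
    store.put(625, 1000300000, b"revision 2")
    print(store.as_of(625, 1000250000))
```

The point of the sketch is that a per-object lookup only needs the
revisions sorted by update time, which is what a column comparator gives
you inside a row. Asking the same question across all objects means
visiting every row, which is the part of the design worth thinking about
when the number of objects is large.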