Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@cassandra.apache.org
MIME-Version: 1.0
In-Reply-To: <AANLkTi=gEqCcWi3GZNJOuJO1yutXguzTjgaMcep0mSyj@mail.gmail.com>
References: <AANLkTi=gEqCcWi3GZNJOuJO1yutXguzTjgaMcep0mSyj@mail.gmail.com>
From: Victor Kabdebon <victor.kabdebon@gmail.com>
Date: Sat, 5 Feb 2011 17:38:28 -0500
Message-ID: <AANLkTi=kOa2T=m0i0C_Krr6p59NJJU_GJ_VmG0KZC6ap@mail.gmail.com>
Subject: Re: revisioned data
To: user@cassandra.apache.org
Content-Type: multipart/alternative; boundary=20cf3054a50788862b049b90a9a5

--20cf3054a50788862b049b90a9a5
Content-Type: text/plain; charset=ISO-8859-1

Hello Raj,

No, it doesn't really make sense from Cassandra's point of view: the
OrderPreservingPartitioner preserves the order of the row *keys* only. In
your model the ordering would be done on the *supercolumn name*, which is
controlled by the column family's comparator, not by the partitioner. For a
super column family, compare_with orders the supercolumns and
compare_subcolumns_with orders the columns inside each supercolumn (sorry, I
don't remember exactly the current names in Cassandra, but that's the idea).
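
To make the distinction concrete, here is a minimal in-memory sketch in
Python. It is not Cassandra client code; the names `revisions` and `as_of`
and the sample values are made up for illustration. It assumes one row per
object whose column names are update times kept in comparator order, so an
as-of lookup is a search inside that single row and does not depend on how
the partitioner orders row keys.

```python
from bisect import bisect_right

# Hypothetical stand-in for one row: column names are update times (longs),
# kept sorted the way a long comparator would keep them.
revisions = [
    (1000245242, b"v1"),
    (1000245900, b"v2"),
    (1000250000, b"v3"),
]

def as_of(columns, when):
    """Return the value of the last column whose name is <= when, or None."""
    names = [name for name, _ in columns]
    i = bisect_right(names, when)  # index just past the last name <= when
    return columns[i - 1][1] if i else None
```

In a real cluster the same idea would presumably be a reversed column slice
starting at the queried time with a count of 1.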

However, and I think many will agree here, try to avoid SuperColumns.
Rather than using SuperColumns, try to think of it like this:

CF1 : "ObjectStore"
Key :ID (long)
Columns : {
    name
    other fields
    update time (long [date])
    ...}

CF2 : "ObjectOrder"
Key : "myorderedobjects"
Columns : {
    { name : identifier that can be sorted,
      value : ObjectID },
    ...
}
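
As a rough illustration of the two-column-family layout, here is a small
in-memory sketch in Python using plain dicts, not a Cassandra client. Several
details are assumptions that the layout above does not specify: each revision
gets its own row key in "ObjectStore", and the sortable column name in
"ObjectOrder" is the pair (object ID, update time). The helpers
`put_revision` and `get_as_of` are hypothetical names.

```python
from bisect import bisect_right

object_store = {}                         # CF1: revision key -> columns
object_order = {"myorderedobjects": {}}   # CF2: one row, sortable column names

def put_revision(rev_key, object_id, update_time, fields):
    """Write the object's fields to CF1 and one index column to CF2."""
    object_store[rev_key] = dict(fields, update_time=update_time)
    object_order["myorderedobjects"][(object_id, update_time)] = rev_key

def get_as_of(object_id, when):
    """Return the latest revision of object_id at or before `when`, or None."""
    row = object_order["myorderedobjects"]
    names = sorted(row)  # a comparator would keep the column names sorted
    i = bisect_right(names, (object_id, when))
    if i == 0 or names[i - 1][0] != object_id:
        return None  # no revision of this object exists that early
    return object_store[row[names[i - 1]]]

put_revision("625@100", 625, 100, {"name": "first"})
put_revision("625@200", 625, 200, {"name": "second"})
put_revision("700@150", 700, 150, {"name": "other"})
```

With this sample data, get_as_of(625, 150) returns the "first" revision and
get_as_of(625, 50) returns None. Keeping every index column in a single row
makes that row grow without bound, so in practice you would probably want
one index row per object or per time bucket.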

Best regards,
Victor Kabdebon,
http://www.voxnucleus.fr

2011/2/5 Raj Bakhru <rbakhru@gmail.com>

> Hi all -
>
> We're new to Cassandra and have read plenty on the data model, but we
> wanted to poll for thoughts on how to best handle this structure.
>
> We have simple objects that have an ID, and we want to maintain a history
> of all the revisions.
>
> e.g.
> MyObject:
>     ID (long)
>     name
>     other fields
>     update time (long [date])
>
>
> Any time the object changes, we'll store a new version of the object
> (same ID, but different update time and other fields).  We need to be able
> to query what the object was as of any time historically.  We also need
> to be able to query what some or all of the items of this object type
> were as of any time historically.
>
> In SQL, we'd just find the max(id) where update time < queried_as_of_time
>
> In Cassandra, we were thinking of modeling as follows:
>
> CF:  MyObjectType
> Super-Column: ID of object (e.g. 625)
> Column:  updatetime  (e.g. "1000245242")
> Value: byte[] of serialized object
>
> We were thinking of using the OrderPreservingPartitioner and using range
> queries against the data.
>
> Does this make sense?  Are we approaching this in the wrong way?
>
> Thanks a lot

--20cf3054a50788862b049b90a9a5--