Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@cassandra.apache.org
MIME-Version: 1.0
In-Reply-To: <BANLkTikZm38zv9GhmGLGautr_fp8+yJBBQ@mail.gmail.com>
References: <BANLkTi=2+0mJG31ARgWepvp6D63k1X9KxQ@mail.gmail.com>
 <BANLkTikZm38zv9GhmGLGautr_fp8+yJBBQ@mail.gmail.com>
From: Victor Kabdebon <victor.kabdebon@gmail.com>
Date: Wed, 13 Apr 2011 11:23:58 -0400
Message-ID: <BANLkTinnovPbUqTpGxRDcw8kj0fo_6LvkQ@mail.gmail.com>
Subject: Re: database design
To: user@cassandra.apache.org
Content-Type: multipart/alternative; boundary=00151747bdd0f8444a04a0ce6636

--00151747bdd0f8444a04a0ce6636
Content-Type: text/plain; charset=ISO-8859-1

Dear Jean-Yves,

You can take a different approach to the problem.
You need, on one side, a relational database (MySQL, PostgreSQL) or Solr (as
a very efficient index) and, on the other side, Cassandra. The relational
database or Solr should hold as little information as possible: a date and
only the data relevant to your queries. This let me keep a simple model for
Cassandra.
Cassandra acts as a "vault" where you keep all the data, and from there you
dispatch the data to the relational database or Solr. When you want to
query, you query Solr or the relational database for the key / column /
supercolumn, and then retrieve the complete data from Cassandra. The hard
part is keeping the query side and the Cassandra side consistent.
I speak from personal experience: it was very hard for me to use only
Cassandra to do everything my (small amateur) website needed. The
alternative I now use is Cassandra (data vault) + Redis (sessions and other
volatile data) + Solr (search engine) + PostgreSQL (for relational
queries).
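
A minimal sketch of the vault-plus-index pattern in Python, under loud
assumptions: a plain dict stands in for Cassandra, the standard-library
sqlite3 module stands in for PostgreSQL, and the table name, field names,
and both helper functions are hypothetical, chosen only for illustration.
It shows the two steps described above: write the full record to the vault
and a minimal row to the index, then query the index for keys and fetch the
complete records from the vault.

```python
import sqlite3
from datetime import date

# ASSUMPTION: a dict stands in for Cassandra (row key -> full record).
vault = {}

# ASSUMPTION: in-memory sqlite3 stands in for the relational side.
# It holds only the row key, a date, and the one field we query on.
index = sqlite3.connect(":memory:")
index.execute(
    "CREATE TABLE article_index ("
    "row_key TEXT PRIMARY KEY, published TEXT, author TEXT)"
)


def save_article(row_key, record):
    """Store the full record in the vault, then a minimal row in the index."""
    vault[row_key] = record
    index.execute(
        "INSERT OR REPLACE INTO article_index VALUES (?, ?, ?)",
        (row_key, record["published"].isoformat(), record["author"]),
    )
    index.commit()


def articles_by_author(author):
    """Ask the index for matching keys, then read full records from the vault."""
    rows = index.execute(
        "SELECT row_key FROM article_index "
        "WHERE author = ? ORDER BY published",
        (author,),
    ).fetchall()
    return [vault[row_key] for (row_key,) in rows]


save_article("a1", {"author": "victor", "published": date(2011, 4, 12),
                    "title": "First", "body": "full text one"})
save_article("a2", {"author": "edward", "published": date(2011, 4, 13),
                    "title": "Second", "body": "full text two"})
save_article("a3", {"author": "victor", "published": date(2011, 4, 14),
                    "title": "Third", "body": "full text three"})

victor_titles = [a["title"] for a in articles_by_author("victor")]
```

The two writes in save_article are not atomic, which is exactly where the
consistency problem comes from: if the second write fails, the vault and the
index disagree until something repairs the index.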

Best regards,
Victor Kabdebon
http://www.voxnucleus.fr

2011/4/13 Edward Capriolo <edlinuxguru@gmail.com>

> On Wed, Apr 13, 2011 at 10:39 AM, Jean-Yves LEBLEU <jlebleu@gmail.com>
> wrote:
> > Hi all,
> >
> > Just some thoughts and a question about Cassandra data modeling.
> >
> > If I understand correctly, Cassandra is better at writing than at
> > reading, so you have to think about your queries when designing the
> > Cassandra schema. We are doing incremental design, already have our
> > system in production, and now have to develop new queries.
> > What do you usually do when you have new queries? Do you write a
> > specific job to update the data in the database to match the new query
> > you are writing?
> >
> > Thanks for your help.
> >
> > Jean-Yves
> >
>
> Good point. Generally you will need to write some type of range
> scanning/MapReduce application to process and backfill your data.
>
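
On the backfill job Edward mentions, a rough sketch in Python of what such
a job does, again under loud assumptions: a dict stands in for the data
already stored in Cassandra, sqlite3 stands in for the query side, and
scan_vault, backfill_tag_index, and the tag_index table are hypothetical
names. A real job would use a range scan or a MapReduce job over the column
family instead of iterating over a dict.

```python
import sqlite3

# ASSUMPTION: a dict stands in for rows already stored in Cassandra.
vault = {
    "a1": {"author": "victor", "tag": "cassandra", "title": "First"},
    "a2": {"author": "edward", "tag": "hadoop", "title": "Second"},
    "a3": {"author": "victor", "tag": "solr", "title": "Third"},
}

# ASSUMPTION: in-memory sqlite3 stands in for the query side.
index = sqlite3.connect(":memory:")


def scan_vault(batch_size=2):
    """Yield (row_key, record) pairs in key order, one batch at a time.

    Stand-in for a range scan over every row in the column family.
    """
    keys = sorted(vault)
    for start in range(0, len(keys), batch_size):
        for key in keys[start:start + batch_size]:
            yield key, vault[key]


def backfill_tag_index():
    """Build the index for a new query (by tag) from data already stored."""
    index.execute(
        "CREATE TABLE IF NOT EXISTS tag_index ("
        "row_key TEXT PRIMARY KEY, tag TEXT)"
    )
    count = 0
    for row_key, record in scan_vault():
        # INSERT OR REPLACE keeps the job safe to run more than once.
        index.execute(
            "INSERT OR REPLACE INTO tag_index VALUES (?, ?)",
            (row_key, record["tag"]),
        )
        count += 1
    index.commit()
    return count


rows_indexed = backfill_tag_index()
```

Making the job idempotent means it can be re-run after a failure, or while
new writes are still arriving, without duplicating index rows.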

--00151747bdd0f8444a04a0ce6636--