incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: One CF vs several CFs
Date Mon, 17 Oct 2011 20:54:44 GMT
It depends on what your workload is and how you want to read the data. 

If you want to get all the data for an article every time, and the number of comments is not
huge, go with option 1.
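
To make the trade-off concrete, here is a rough in-memory sketch in Python. It is not Cassandra client code: the dicts stand in for column families, and the names `posts_single`, `slice_columns`, `load_article_single` and `load_article_split` are made up for illustration. It only shows that option 1 serves an article from one row (using a column slice to pick out the comments), while option 2 needs one read per CF. Tags are left out of the split version because the Tags CF in the original question is keyed by tag, not by slug.

```python
from bisect import bisect_left

# Option 1 (hypothetical data): one CF, everything for a post in one row.
posts_single = {
    "slug-1": {
        "title": "Hello",
        "body": "...",
        "tag1": "cassandra",
        "tag2": "nosql",
        "comment1": "first!",
        "comment2": "nice post",
    }
}

# Option 2 (hypothetical data): post fields and comments in separate CFs.
posts = {"slug-1": {"title": "Hello", "body": "..."}}
comments = {"slug-1": {1: "first!", 2: "nice post"}}  # timestamp -> comment


def slice_columns(row, start, finish):
    """Mimic a slice query: columns with start <= name < finish, in name order."""
    names = sorted(row)
    lo = bisect_left(names, start)
    hi = bisect_left(names, finish)
    return [(name, row[name]) for name in names[lo:hi]]


def load_article_single(slug):
    """Option 1: one row read, then slice the comment columns out of it."""
    row = posts_single[slug]  # read 1
    comment_cols = slice_columns(row, "comment", "comment\uffff")
    article = {
        "title": row["title"],
        "body": row["body"],
        "comments": [value for _, value in comment_cols],
    }
    return article, 1  # (article, number of row reads)


def load_article_split(slug):
    """Option 2: one row read per CF that holds part of the article."""
    post = posts[slug]                    # read 1
    comment_row = comments.get(slug, {})  # read 2
    article = {
        "title": post["title"],
        "body": post["body"],
        "comments": [comment_row[ts] for ts in sorted(comment_row)],
    }
    return article, 2  # (article, number of row reads)
```

Both loaders return the same article here; the difference is the number of rows touched, and in option 1 the row keeps growing with every comment, which is where the "not huge" caveat comes from.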

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 18/10/2011, at 2:45 AM, Konstantin Naryshkin wrote:

> Method 1 may also result in very wide rows if you have lots and lots of tags and comments.
> This can be a serious inefficiency for Cassandra (but again, it depends on your data).
> 
> On Mon, Oct 17, 2011 at 05:40, Chintana Wilamuna <chintanaw@gmail.com> wrote:
> Hi,
> 
> Does anyone have an idea about the pros/cons of modeling your data
> in the following ways? The first is to write all your data within a single
> CF. Using the infamous blog example:
> 
> Posts = { // CF
>        slug-1: { // key to the row inside CF
>                title: "...",
>                body: "...",
>                tag1: "...",
>                tag2: "...",
>                ...
>                tagN: "...",
>                comment1: "...",
>                comment2: "...",
>                commentN: "..."
>        },
> 
>        slug-2: {
>                title: "...",
>                body: "...",
>                tag1: "...",
>                tag2: "...",
>                ...
>                tagN: "...",
>                comment1: "...",
>                comment2: "...",
>                commentN: "..."
>        }
> }
> 
> Using this model, one has to do slice queries to retrieve the tags or
> comments for a given blog post, but all tag and comment info for a
> particular post is available with a single read.
> 
> The other method is to break tags and comments out into their own CFs:
> 
> Posts = { // CF
>        slug-1: { // key to the row inside CF
>                title: "...",
>                body: "..."
>        },
> 
>        slug-2: {
>                title: "...",
>                body: "..."
>        }
> }
> 
> Tags = { // CF
>        tag1: { // key to the row inside CF
>                timestamp1: slug-1,
>                timestamp2: slug-2
>        },
> 
>        tag2: {
>                timestamp1: slug-2
>        }
> }
> 
> Comments = { // CF
>        slug-1: { // key to the row inside CF
>                timestamp1: "comment1 ...",
>                timestamp2: "comment2 ...",
>                timestamp3: "comment3 ..."
>        },
> 
>        slug-2: {
>                timestamp1: "comment1 ..."
> 
>        }
> }
> 
> Here, you have to do a couple of queries when you're trying to get
> tags and comments for a particular post.
> 
> Is the answer "it depends", or is there an inherent inefficiency
> associated with method 1 regardless of the data you're trying to
> model?
> 
> Thanks in advance,
> 
>    -Chintana
> 
> --
> blog: engwar.com/
> photos: flickr.com/photos/chintana
> linkedin: linkedin.com/in/engwar
> facebook: facebook.com/chintana
> twitter: twitter.com/std_err
> 

