incubator-cassandra-user mailing list archives

From Konstantin Naryshkin <konstant...@a-bb.net>
Subject Re: One CF vs several CFs
Date Mon, 17 Oct 2011 13:45:49 GMT
Method 1 may also result in very wide rows if you have lots and lots of tags
and comments. Very wide rows can be a serious inefficiency for Cassandra (but
again, it depends on your data).
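
To make the row-width point concrete, here is a rough Python sketch that models
a column family as a plain dict of rows. Nothing in it talks to Cassandra; the
function names and the tag/comment counts are made up for illustration. It only
counts how many columns land in one post's row under each method.

```python
# Toy model: a column family is a dict of row key -> {column name: value}.
# Nothing here talks to Cassandra; names and sizes are hypothetical.

def method1_row(n_tags, n_comments):
    """Single-CF layout: title, body, tags and comments share one row."""
    row = {"title": "...", "body": "..."}
    for i in range(1, n_tags + 1):
        row["tag%d" % i] = "..."
    for i in range(1, n_comments + 1):
        row["comment%d" % i] = "..."
    return row

def method2_rows(n_comments):
    """Split layout: the Posts row stays fixed-size; comments get their own row."""
    post_row = {"title": "...", "body": "..."}
    comment_row = {"timestamp%d" % i: "comment%d ..." % i
                   for i in range(1, n_comments + 1)}
    return post_row, comment_row

wide = method1_row(n_tags=20, n_comments=5000)
post, comments = method2_rows(n_comments=5000)

print(len(wide))      # columns in the single Posts row under method 1
print(len(post))      # columns in the Posts row under method 2
print(len(comments))  # columns in the Comments row under method 2
```

Note that in this sketch the Comments row for a busy post is just as wide under
method 2. The difference is that reading the title and body no longer touches
that row.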

On Mon, Oct 17, 2011 at 05:40, Chintana Wilamuna <chintanaw@gmail.com> wrote:

> Hi,
>
> Does anyone have an idea about the pros/cons of modeling your data
> in the following ways? The first is to write all your data within a
> single CF. Using the infamous blog example,
>
> Posts = { // CF
>        slug-1: { // key to the row inside CF
>                title: "...",
>                body: "...",
>                tag1: "...",
>                tag2: "...",
>                ...
>                tagN: "...",
>                comment1: "...",
>                comment2: "...",
>                commentN: "..."
>        },
>
>        slug-2: {
>                title: "...",
>                body: "...",
>                tag1: "...",
>                tag2: "...",
>                ...
>                tagN: "...",
>                comment1: "...",
>                comment2: "...",
>                commentN: "..."
>        }
> }
>
> Using this model, one has to do slice queries to retrieve the tags and
> comments for a given blog post, but all tag and comment info for a
> particular post is available with a single read.
>
> The other method is to break tags and comments out into their own CFs.
>
> Posts = { // CF
>        slug-1: { // key to the row inside CF
>                title: "...",
>                body: "..."
>        },
>
>        slug-2: {
>                title: "...",
>                body: "..."
>        }
> }
>
> Tags = { // CF
>        tag1: { // key to the row inside CF
>                timestamp1: slug-1,
>                timestamp2: slug-2
>        },
>
>        tag2: {
>                timestamp1: slug-2
>        }
> }
>
> Comments = { // CF
>        slug-1: { // key to the row inside CF
>                timestamp1: "comment1 ...",
>                timestamp2: "comment2 ...",
>                timestamp3: "comment3 ..."
>        },
>
>        slug-2: {
>                timestamp1: "comment1 ..."
>
>        }
> }
>
> Here, you have to do a couple of queries when you're trying to get
> tags and comments for a particular post.
>
> Is the answer "it depends", or is there an inherent inefficiency
> associated with method 1 regardless of the data you're trying to
> model?
>
> Thanks in advance,
>
>    -Chintana
>
> --
> blog: engwar.com/
> photos: flickr.com/photos/chintana
> linkedin: linkedin.com/in/engwar
> facebook: facebook.com/chintana
> twitter: twitter.com/std_err
>
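
The two read paths described in the quoted message can be sketched in the same
toy style. This is only an in-memory illustration: slice_columns is a
hypothetical stand-in for a column slice query, not a real client call, and the
sample data is invented.

```python
from bisect import bisect_left

def slice_columns(row, start, finish):
    """Return columns whose names fall in [start, finish), in sorted name order.
    A stand-in for a column slice query; not a real client call."""
    names = sorted(row)
    lo = bisect_left(names, start)
    hi = bisect_left(names, finish)
    return [(name, row[name]) for name in names[lo:hi]]

# Method 1: one row holds everything, so tags and comments come from slices.
posts_single = {
    "slug-1": {"title": "T", "body": "B",
               "tag1": "a", "tag2": "b",
               "comment1": "c1", "comment2": "c2"},
}
row = posts_single["slug-1"]                      # one row read
tags = slice_columns(row, "tag", "tag\uffff")
comments = slice_columns(row, "comment", "comment\uffff")

# Method 2: separate CFs, so a post page needs one lookup per CF.
posts = {"slug-1": {"title": "T", "body": "B"}}
comments_cf = {"slug-1": {"timestamp1": "c1", "timestamp2": "c2"}}
tags_cf = {"tag1": {"timestamp1": "slug-1"}, "tag2": {"timestamp1": "slug-1"}}

post = posts["slug-1"]                            # lookup 1
post_comments = comments_cf["slug-1"]             # lookup 2
# Tags is keyed by tag, not by slug, so finding a post's tags means scanning it.
post_tags = sorted(t for t, cols in tags_cf.items() if "slug-1" in cols.values())
```

One thing the sketch makes visible: with the Tags CF keyed by tag, as in the
quoted layout, it answers "which posts have this tag" directly, but "which tags
does this post have" needs either a scan or a second, slug-keyed row.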
