Method 1 may also result in very wide rows if a post has a large number of
tags and comments. That can be a serious inefficiency for Cassandra (but
again, it depends on your data).
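
To make the row-width point concrete, here is a rough sketch that models both layouts as plain Python dicts. Nothing here talks to Cassandra; the helper names (method1_row, method2_rows) and the counts are made up for illustration only.

```python
# Plain-dict model of the two layouts from the quoted message.
# Hypothetical helpers; no Cassandra client is involved.

def method1_row(n_tags, n_comments):
    """One Posts row holding title, body, every tag and every comment."""
    row = {"title": "...", "body": "..."}
    row.update({f"tag{i}": "..." for i in range(1, n_tags + 1)})
    row.update({f"comment{i}": "..." for i in range(1, n_comments + 1)})
    return row

def method2_rows(n_comments):
    """Separate Posts and Comments rows for the same post."""
    post = {"title": "...", "body": "..."}
    comments = {f"timestamp{i}": f"comment{i} ..."
                for i in range(1, n_comments + 1)}
    return post, comments

wide = method1_row(n_tags=5, n_comments=10_000)
post, comments = method2_rows(n_comments=10_000)

# Column counts per row: the method 1 row carries everything.
print(len(wide), len(post), len(comments))
```

Note that the Comments row in method 2 still grows with the number of comments. What changes is that the Posts row stays at a fixed size, so reading just the title and body never touches the comment columns.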
On Mon, Oct 17, 2011 at 05:40, Chintana Wilamuna <chintanaw@gmail.com> wrote:
> Hi,
>
> Does anyone have an idea about the pros/cons of modeling your data
> in the following ways? The first is to write all your data within a
> single CF. Using the infamous blog example,
>
> Posts = { // CF
> slug-1: { // key to the row inside CF
> title: "...",
> body: "...",
> tag1: "...",
> tag2: "...",
> ...
> tagN: "...",
> comment1: "...",
> comment2: "...",
> commentN: "..."
> },
>
> slug-2: {
> title: "...",
> body: "...",
> tag1: "...",
> tag2: "...",
> ...
> tagN: "...",
> comment1: "...",
> comment2: "...",
> commentN: "..."
> }
> }
>
> Using this model, one has to do slice queries to retrieve the tags and
> comments for a given blog post, but all tag and comment info for a
> particular post is available with a single read.
>
> The other method breaks tags and comments down into their own CFs.
>
> Posts = { // CF
> slug-1: { // key to the row inside CF
> title: "...",
> body: "..."
> },
>
> slug-2: {
> title: "...",
> body: "..."
> }
> }
>
> Tags = { // CF
> tag1: { // key to the row inside CF
> timestamp1: slug-1,
> timestamp2: slug-2
> },
>
> tag2: {
> timestamp1: slug-2
> }
> }
>
> Comments = { // CF
> slug-1: { // key to the row inside CF
> timestamp1: "comment1 ...",
> timestamp2: "comment2 ...",
> timestamp3: "comment3 ..."
> },
>
> slug-2: {
> timestamp1: "comment1 ..."
> }
> }
>
> Here, you have to do a couple of queries when you're trying to get
> tags and comments for a particular post.
>
> Is the answer "it depends", or is there an inherent inefficiency
> associated with method 1 regardless of the data you're trying to
> model?
>
> Thanks in advance,
>
> -Chintana
>
> --
> blog: engwar.com/
> photos: flickr.com/photos/chintana
> linkedin: linkedin.com/in/engwar
> facebook: facebook.com/chintana
> twitter: twitter.com/std_err
>
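
For the read side, a small sketch of what "a single read" versus "a couple of queries" looks like, again with plain dicts standing in for the CFs. The sample data, the function names, and the prefix filter are assumptions for illustration; a real slice query selects a range of column names under the CF's comparator rather than filtering by prefix in the client.

```python
# Method 1: everything for a post lives in one Posts row.
posts_m1 = {
    "slug-1": {"title": "T", "body": "B",
               "tag1": "a", "tag2": "b", "comment1": "c1"},
}

def read_m1(slug):
    row = posts_m1[slug]  # one row lookup
    # Stand-in for slice queries: pick columns by name prefix.
    tags = [v for k, v in row.items() if k.startswith("tag")]
    comments = [v for k, v in row.items() if k.startswith("comment")]
    return tags, comments, 1  # 1 = number of row lookups

# Method 2: separate CFs, with Tags keyed by tag name.
posts_m2 = {"slug-1": {"title": "T", "body": "B"}}
tags_m2 = {"a": {1: "slug-1"}, "b": {2: "slug-1", 3: "slug-2"}}
comments_m2 = {"slug-1": {1: "c1"}}

def read_m2(slug):
    _post = posts_m2[slug]                               # lookup 1
    comments = list(comments_m2.get(slug, {}).values())  # lookup 2
    # Tags is keyed by tag, so finding the tags OF a post means
    # checking every tag row in this toy model.
    tags = [t for t, cols in tags_m2.items() if slug in cols.values()]
    return tags, comments, 2 + len(tags_m2)

print(read_m1("slug-1"))
print(read_m2("slug-1"))
```

In this toy model the Tags CF as laid out answers "which posts have this tag" cheaply, but not "which tags does this post have". If the second question matters for rendering a post, it may be worth storing the tag names on the post row as well.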