cassandra-user mailing list archives

From Drew Kutcharian <>
Subject Revised: Data Modeling advise for Cassandra 0.8 (added #8)
Date Wed, 30 Mar 2011 04:13:10 GMT
I'm pretty new to Cassandra and would like your advice on data modeling. The object model
of the project I'm working on is close to that of Blogger, Tumblr, or any other blogging
website: there are Users, each User can have many Blogs, and each Blog can have many Comments.
How would you model this efficiently, given the following requirements?

1) Be able to directly link to a User
2) Be able to directly link to a Blog
3) Be able to query and get all the Blogs for a User ordered by time created descending (new
blogs first)
4) Be able to query and get all the Comments for each Blog ordered by time created ascending
(old comments first)
5) Be able to link different Users to each other, as a network.
6) Have a well-distributed hash so we don't end up with "hot" nodes while the rest of the
nodes are idle
7) It would be nice to show a User how many Blogs they have, or how many Comments are on a
Blog, without iterating through the whole dataset.
NEW: 8) Be able to query for the most recently added Blogs. For example, Blogs added today,
this week, this month, etc.
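
To make the access patterns in the list above concrete, here is a rough in-memory sketch of the queries in requirements 3, 4, 5, 7 and 8. It is not a Cassandra schema; the names `Blog`, `Comment`, `BlogStore` and all of their methods are hypothetical and exist only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Blog:
    blog_id: int
    user_id: int
    created_ms: int  # creation time in milliseconds


@dataclass(frozen=True)
class Comment:
    comment_id: int
    blog_id: int
    created_ms: int


class BlogStore:
    """Hypothetical in-memory stand-in for the queries listed above."""

    def __init__(self):
        self.blogs = {}                            # blog_id -> Blog (req. 2)
        self.blogs_by_user = defaultdict(list)     # user_id -> [Blog]
        self.comments_by_blog = defaultdict(list)  # blog_id -> [Comment]
        self.links = defaultdict(set)              # user_id -> linked user_ids

    def add_blog(self, blog):
        self.blogs[blog.blog_id] = blog
        self.blogs_by_user[blog.user_id].append(blog)

    def add_comment(self, comment):
        self.comments_by_blog[comment.blog_id].append(comment)

    def link_users(self, a, b):
        # Req. 5: store the link in both directions.
        self.links[a].add(b)
        self.links[b].add(a)

    def blogs_for_user(self, user_id):
        # Req. 3: newest blogs first.
        return sorted(self.blogs_by_user[user_id],
                      key=lambda b: b.created_ms, reverse=True)

    def comments_for_blog(self, blog_id):
        # Req. 4: oldest comments first.
        return sorted(self.comments_by_blog[blog_id],
                      key=lambda c: c.created_ms)

    def blog_count(self, user_id):
        # Req. 7: the count comes from the per-user list length.
        return len(self.blogs_by_user[user_id])

    def recent_blogs(self, since_ms):
        # Req. 8: all blogs created at or after since_ms, newest first.
        recent = [b for b in self.blogs.values() if b.created_ms >= since_ms]
        return sorted(recent, key=lambda b: b.created_ms, reverse=True)
```

In a real store, each of these methods would need to map onto a lookup or slice that the database can serve directly, which is the modeling question being asked here.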

The target Cassandra version is 0.8, so that we can use Secondary Indexes. The goal is to be
very efficient, so no text keys; we were thinking of using time-based 64-bit ids generated with
Snowflake.
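
For reference, a minimal sketch of what a Snowflake-style time-based 64-bit id could look like. It assumes the commonly described layout of 41 bits of milliseconds since a custom epoch, 10 bits of worker id and 12 bits of per-millisecond sequence. The `EPOCH_MS` value, the class name `SnowflakeLikeGenerator` and the helper `timestamp_ms` are hypothetical; this is not the actual Snowflake service.

```python
import threading
import time

# Assumed bit layout: [41 bits timestamp][10 bits worker][12 bits sequence]
EPOCH_MS = 1_300_000_000_000  # hypothetical custom epoch, in milliseconds
WORKER_BITS = 10
SEQUENCE_BITS = 12
MAX_WORKER_ID = (1 << WORKER_BITS) - 1
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1


class SnowflakeLikeGenerator:
    """Generates ids that increase with time for a single worker."""

    def __init__(self, worker_id, clock=lambda: int(time.time() * 1000)):
        if not 0 <= worker_id <= MAX_WORKER_ID:
            raise ValueError("worker_id out of range")
        self.worker_id = worker_id
        self._clock = clock          # injectable so the sketch can be tested
        self._last_ms = -1
        self._sequence = 0
        self._lock = threading.Lock()

    def next_id(self):
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                # Same millisecond: bump the sequence counter.
                self._sequence = (self._sequence + 1) & MAX_SEQUENCE
                if self._sequence == 0:
                    # Sequence exhausted; spin until the next millisecond.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._sequence = 0
            self._last_ms = now
            return (((now - EPOCH_MS) << (WORKER_BITS + SEQUENCE_BITS))
                    | (self.worker_id << SEQUENCE_BITS)
                    | self._sequence)


def timestamp_ms(snowflake_id):
    """Recover the creation time (ms since the Unix epoch) from an id."""
    return (snowflake_id >> (WORKER_BITS + SEQUENCE_BITS)) + EPOCH_MS
```

Because the timestamp occupies the high bits, comparing two such ids numerically compares their creation times first, which is what makes them attractive for the "ordered by time created" requirements.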

