cassandra-user mailing list archives

From: Roshni Rajagopal
Subject: RE: Data Modeling: Comments with Voting
Date: Tue, 02 Oct 2012 03:19:56 GMT

Hi,

To explain my suggestions, my thoughts were:

a) You need to store entity-type information about a comment, such as date created, comment
text, and commented by. I can't think of any other master information for a comment, but in
general one starts with entities in a standard static column family. If you store an entity in
a dynamic denormalized form, then whenever any master data changes you would need to iterate
across all rows and update it, which is expensive in Cassandra. Here the comment text is editable.

b) So when a comment is created, it goes into the static column family. An entry is also made
in the dynamic sort_by_time_list column family, with the creation time as the column name. I
didn't suggest that a and c be combined, so that master information remains in one place. The
other approach would be to store the comment as JSON in the column value. However, if you need
to update the comment text, it would be hard to identify the comment column and update it.

c) When a comment gets a vote, the counter column family is incremented to track the number of
votes for the comment. Also, to sort by number of votes, after incrementing the counter you
need to write the current number of votes and the comment id to column family d. But I see now
that you also need to delete the old (number of votes, comment id) column and add a new column
with the current number of votes and comment id. The row would then be sorted by number of votes.

If there are many ways to sort, it's better to do it in the application to avoid having a new
column family for each type of sort. However, I'm not certain which approach would perform
better over time and volume. Sorting can be complex; see Aaron's blog post.
Welcome any feedback on my suggestions.
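
As a rough illustration of the vote path in (c), here is a minimal in-memory sketch in Python. It is
not Cassandra client code: the class VoteIndex and its methods are hypothetical names, a dict stands
in for the counter column family, and a sorted list of (votes, comment_id) tuples stands in for the
composite column names of the sort-by-votes row. It is single-threaded, so it says nothing about
what happens when two voters hit the same comment at once.

```python
import bisect


class VoteIndex:
    """Hypothetical in-memory stand-in for the counter CF and the sort-by-votes row."""

    def __init__(self):
        self.counters = {}  # comment_id -> vote count (counter column family)
        self.by_votes = []  # sorted (votes, comment_id) pairs (composite column names)

    def add_comment(self, comment_id):
        self.counters[comment_id] = 0
        bisect.insort(self.by_votes, (0, comment_id))

    def vote(self, comment_id):
        old = self.counters[comment_id]
        new = old + 1
        self.counters[comment_id] = new
        # Delete the stale (old votes, comment id) column, if present ...
        i = bisect.bisect_left(self.by_votes, (old, comment_id))
        if i < len(self.by_votes) and self.by_votes[i] == (old, comment_id):
            del self.by_votes[i]
        # ... then add the column for the current count.
        bisect.insort(self.by_votes, (new, comment_id))

    def top(self, n):
        """Return up to n comment ids, highest vote count first."""
        return [cid for _, cid in reversed(self.by_votes)][:n]
```

Skipping the delete step would leave one column per historical vote count for each comment, which
is why the old column has to go before the new one is written.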

Subject: Re: Data Modeling: Comments with Voting
Date: Tue, 2 Oct 2012 10:39:42 +1300

You cannot (and probably do not want to) sort continually when the voting is going on. 
You can store the votes using CounterColumnType in column values. When someone votes you
then (somehow) queue a job that will read the vote counts for the post / comment, pivot and
sort on the vote count, and then write the updated leader board to Cassandra. 
Alternatively if you have a small number of comments for a post just read all the votes and
sort them as part of the read. 

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
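
The "read, pivot, sort, write" job described above might look roughly like the following Python
sketch. The function name build_leaderboard and the sample vote_counts dict are assumptions for
illustration; in practice the counts would be read from the counter columns and the result written
back as the leader board row. The same function also covers the small-post case of sorting as part
of the read.

```python
def build_leaderboard(vote_counts):
    """Pivot {comment_id: votes} into (votes, comment_id) pairs, highest votes first."""
    pivoted = [(votes, cid) for cid, votes in vote_counts.items()]
    # Sort by votes descending; break ties by comment id so the order is deterministic.
    pivoted.sort(key=lambda pair: (-pair[0], pair[1]))
    return pivoted


# Hypothetical counts, as they might be read back from the counter columns.
vote_counts = {"comment1": 5, "comment2": 1, "comment3": 0, "comment4": 10}
leaderboard = build_leaderboard(vote_counts)
```

Because the job rebuilds the whole board from the counters each time it runs, concurrent votes
only affect how fresh the board is, not whether it contains duplicates.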

On 30/09/2012, at 8:25 AM, Drew Kutcharian <> wrote:

Thanks Roshni,
I'm not sure how #d will work when users are actually voting on a comment. What happens when
two users vote on the same comment simultaneously? How do you update the entries in the #d column
family to prevent duplicates?
Also, #a and #c can be combined by using TimeUUIDs as comment ids.
- Drew
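
To make the TimeUUID idea concrete: a version 1 UUID embeds its creation timestamp, so the comment
id itself can carry the created date. A minimal Python sketch using only the standard library
follows; the helper names new_comment_id and created_at are hypothetical.

```python
import uuid
from datetime import datetime, timezone

# Number of 100-nanosecond intervals between 1582-10-15 (the UUID epoch) and 1970-01-01.
GREGORIAN_OFFSET = 0x01B21DD213814000


def new_comment_id():
    """Generate a time-based (version 1) UUID to use as the comment id."""
    return uuid.uuid1()


def created_at(comment_id):
    """Recover the creation time embedded in a version 1 UUID."""
    unix_seconds = (comment_id.time - GREGORIAN_OFFSET) / 1e7
    return datetime.fromtimestamp(unix_seconds, tz=timezone.utc)


comment_id = new_comment_id()
print(comment_id, created_at(comment_id))
```

Note that Python compares UUID objects by their full 128-bit value, which is not chronological, so
client-side sorting should use key=lambda u: u.time. Cassandra's TimeUUIDType comparator orders
columns by the embedded timestamp.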

On Sep 27, 2012, at 2:13 AM, Roshni Rajagopal <> wrote:

Hi Drew,
I think you have 4 requirements. Here are my suggestions.
a) Store comments: have a static column family for comments with master data like created
date, created by, length, etc.
b) When a person votes for a comment, increment a vote counter: have a counter column family
for incrementing the votes for each comment.
c) Display comments sorted by date created: have a column family with a dummy row id
'sort_by_time_list'; column names can be date created (timeUUID), and the column value can be
the comment id.
d) Display comments sorted by number of votes: have a column family with a dummy row id
'sort_by_votes_list'; column names can be a composite of number of votes and comment id (as
more than one comment can have the same votes).
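
The four column families above can be pictured with plain Python containers. This is only a sketch
of the layout and of the write path when a comment is created, not Cassandra client code. The
function names are hypothetical, and an integer timestamp stands in for the timeUUID column name
to keep the example short.

```python
comments = {}               # (a) static CF: comment_id -> master data
vote_counters = {}          # (b) counter CF: comment_id -> votes
sort_by_time_list = {}      # (c) one wide row: created time -> comment_id
sort_by_votes_list = set()  # (d) one wide row: (votes, comment_id) composite names


def create_comment(comment_id, created, created_by, text):
    """Write the master record once, then add one entry to each index row."""
    comments[comment_id] = {"created": created, "created_by": created_by, "text": text}
    vote_counters[comment_id] = 0
    sort_by_time_list[created] = comment_id
    sort_by_votes_list.add((0, comment_id))


def comments_by_time():
    """Read (c) in column-name order and look up master data in (a)."""
    return [comments[sort_by_time_list[t]] for t in sorted(sort_by_time_list)]
```

Editing a comment's text only touches (a), since (c) and (d) hold comment ids rather than copies
of the comment.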


> Date: Wed, 26 Sep 2012 17:36:13 -0700
> From:
> To:
> CC:
> Subject: Re: Data Modeling: Comments with Voting
> Depending on your needs, you could simply duplicate the comments in two 
> separate CFs with the column names including time in one and the vote in 
> the other. If you allow for updates to the comments, that would pose 
> some issues you'd need to solve at the app level.
> On 9/26/12 4:28 PM, Drew Kutcharian wrote:
> > Hi Guys,
> >
> > Wondering what would be the best way to model a flat (no sub comments, i.e. twitter)
> > comments list with support for voting (where I can sort by create time or votes) in Cassandra?
> >
> > To demonstrate:
> >
> > Sorted by create time:
> > - comment 1 (5 votes)
> > - comment 2 (1 votes)
> > - comment 3 (no votes)
> > - comment 4 (10 votes)
> >
> > Sorted by votes:
> > - comment 4 (10 votes)
> > - comment 1 (5 votes)
> > - comment 2 (1 votes)
> > - comment 3 (no votes)
> >
> > It's the sorted-by-votes that I'm having a bit of trouble with. I'm looking for
> > a roll-your-own approach and prefer not to use secondary indexes and CQL sorting.
> >
> > Thanks,
> >
> > Drew
> >
