incubator-cassandra-user mailing list archives

From Simon Reavely <simon.reav...@gmail.com>
Subject Re: Schema question
Date Tue, 21 Sep 2010 10:20:57 GMT
Thanks for the writeup...good stuff!
Any lessons learnt you'd like to share or challenges that persist?


Simon Reavely


On Sep 20, 2010, at 6:37 AM, Juho Mäkinen <juho.makinen@gmail.com> wrote:

> We have built a Facebook-style "messenger" into our web site which
> uses Cassandra as its storage backend, with two column families:
> TalkMessages and TalkLastMessages. I've uploaded a screenshot showing
> the feature in action to
> http://img138.imageshack.us/img138/3807/talkexample.jpg
> 
> TalkMessages contains each message between two participants. The key
> is a string built from the two users' uids: "$smaller_uid:$bigger_uid".
> Each column inside this CF contains a single message. The column name
> is the message timestamp in microseconds since epoch, stored as
> LongType. The column value is a JSON-encoded string containing the
> following fields: sender_uid, target_uid, msg.
> 
> This results in the following structure inside the column family.
> 
> "2249:9111" => [
>  12345678 : { sender_uid : 2249, target_uid : 9111, msg : "Hello, how are you?" },
>  12345679 : { sender_uid : 9111, target_uid : 2249, msg : "I'm fine, thanks" }
> ]
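
A minimal sketch of how the row key and a single column could be built, assuming uids are integers compared numerically. The helper names `talk_key` and `message_column` are hypothetical and not taken from the original system; no Cassandra client is involved here.

```python
import json


def talk_key(uid_a, uid_b):
    """Build the "$smaller_uid:$bigger_uid" row key (hypothetical helper).

    Assumes uids are integers, so "smaller" means numerically smaller.
    """
    low, high = sorted((uid_a, uid_b))
    return f"{low}:{high}"


def message_column(timestamp_us, sender_uid, target_uid, msg):
    """Return a (column name, column value) pair for one message.

    The name is the timestamp in microseconds; the value is the JSON payload
    with the three fields described in the message above.
    """
    value = json.dumps(
        {"sender_uid": sender_uid, "target_uid": target_uid, "msg": msg}
    )
    return timestamp_us, value
```

Because the key is ordered, both participants resolve to the same row regardless of who sends: `talk_key(9111, 2249)` and `talk_key(2249, 9111)` give the same string.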
> 
> TalkLastMessages is used to quickly fetch a user's talk partners, the
> last message sent between the peers, and other similar data.
> This lets us fetch everything needed to display a "main view" for all
> online friends with just one query to Cassandra. This column family
> uses the user uid as its key. Each column represents a talk partner
> the user has been talking to, with the talk partner's uid as the
> column name. The column value is a JSON-encoded structure containing
> the following fields:
> - last message timestamp: microseconds since epoch when a message was
> last sent between these two users.
> - unread timestamp : microseconds since epoch when the first unread
> message was sent between these two users.
> - unread : count of unread messages.
> - last message : last message between these two users.
> 
> This results in the following structure inside the column family for
> these two example users: 2249 and 9111.
> 
> "2249" => [
>  9111 : { last_message_timestamp : 12345679, unread_timestamp : 12345679, unread : 1, last_message: "I'm fine, thanks" }
> ],
> "9111" => [
>  2249 : { last_message_timestamp : 12345679, unread_timestamp : 12345679, unread : 0, last_message: "I'm fine, thanks" }
> ]
> 
> Displaying chat (this happens on every page load, so it needs to be fast):
> 1) Fetch all columns from TalkLastMessages for the user
> 
> Displaying message history between two participants:
> 1) Fetch last n columns from TalkMessages for the relevant
> "$smaller_uid:$bigger_uid" row.
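
The "last n columns" read can be illustrated with an in-memory stand-in for one TalkMessages row. This is only a sketch: the row is modelled as a plain dict from timestamp to JSON string, and `last_n_messages` is a hypothetical name. Against a real cluster the same result would presumably come from a reversed slice with a count limit rather than from sorting on the client.

```python
import json


def last_n_messages(row, n):
    """Return the n newest (timestamp, message) pairs, oldest first.

    `row` is a dict {timestamp_us: json_string} standing in for one
    TalkMessages row; this is an assumption, not a Cassandra client call.
    """
    # Take the newest n columns, as a reversed slice with a limit would.
    newest_first = sorted(row.items(), key=lambda item: item[0], reverse=True)[:n]
    # Flip back to chronological order for display.
    return [(ts, json.loads(value)) for ts, value in reversed(newest_first)]
```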
> 
> Marking all messages sent by the other participant as read (when you
> read the messages):
> 1) Get column $sender_uid from row $reader_uid from TalkLastMessages
> 2) Update the JSON payload and insert the column back
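
The two steps above amount to a read-modify-write on one column. A rough sketch, with TalkLastMessages modelled as nested dicts (row key string, then partner uid, then JSON string); `mark_read` is a hypothetical name. It only resets the counter and leaves `unread_timestamp` alone, because the example row for 9111 above keeps its timestamp with `unread : 0`; what the real system does with that field is not stated.

```python
import json


def mark_read(talk_last_messages, reader_uid, sender_uid):
    """Reset the unread counter in the reader's row for one talk partner.

    `talk_last_messages` is a dict of dicts standing in for the column
    family (an assumption for illustration only).
    """
    row = talk_last_messages[str(reader_uid)]
    # Step 1: get column $sender_uid from row $reader_uid.
    payload = json.loads(row[sender_uid])
    # Step 2: update the JSON payload and insert the column back.
    payload["unread"] = 0
    row[sender_uid] = json.dumps(payload)
    return payload
```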
> 
> Sending a message involves the following operations:
> 1) Insert a new column into TalkMessages
> 2) Fetch the relevant column from TalkLastMessages from the $target_uid
> row with the $sender_uid column
> 3) Update the column's JSON payload and insert it back into TalkLastMessages
> 4) Fetch the relevant column from TalkLastMessages from the $sender_uid
> row with the $target_uid column
> 5) Update the column's JSON payload and insert it back into TalkLastMessages
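
The five steps can be sketched end to end with in-memory dicts standing in for the two column families. Everything here is illustrative: `send_message` and `_update_summary` are hypothetical names, and the unread bookkeeping (increment the counter in the recipient's row, set `unread_timestamp` only when the counter was zero) is an assumption, since the message above does not spell it out.

```python
import json
from collections import defaultdict

# In-memory stand-ins for the two column families (not a Cassandra client).
talk_messages = defaultdict(dict)       # "low:high" -> {timestamp_us: json}
talk_last_messages = defaultdict(dict)  # "uid" -> {partner_uid: json}


def _update_summary(row_uid, partner_uid, msg, timestamp_us, is_unread):
    """Fetch one TalkLastMessages column, update it, and write it back."""
    row = talk_last_messages[str(row_uid)]
    raw = row.get(partner_uid)
    payload = json.loads(raw) if raw else {
        "last_message_timestamp": None,
        "unread_timestamp": None,
        "unread": 0,
        "last_message": None,
    }
    payload["last_message_timestamp"] = timestamp_us
    payload["last_message"] = msg
    if is_unread:
        if payload["unread"] == 0:
            # Assumed rule: remember when the first unread message arrived.
            payload["unread_timestamp"] = timestamp_us
        payload["unread"] += 1
    row[partner_uid] = json.dumps(payload)


def send_message(sender_uid, target_uid, msg, timestamp_us):
    # Step 1: insert the new column into TalkMessages.
    low, high = sorted((sender_uid, target_uid))
    talk_messages[f"{low}:{high}"][timestamp_us] = json.dumps(
        {"sender_uid": sender_uid, "target_uid": target_uid, "msg": msg}
    )
    # Steps 2-3: recipient's row, sender's column (message is unread there).
    _update_summary(target_uid, sender_uid, msg, timestamp_us, is_unread=True)
    # Steps 4-5: sender's row, recipient's column.
    _update_summary(sender_uid, target_uid, msg, timestamp_us, is_unread=False)
```

Note that each fetch-then-insert pair is not atomic, so two messages arriving at the same moment could overwrite each other's counter update; the sketch does not try to address that.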
> 
> There are also other operations and the actual payload is a bit more complex.
> 
> I'm happy to answer questions if somebody is interested :)
> 
> - Juho Mäkinen
> 
> 
> 
> On Mon, Sep 20, 2010 at 12:57 PM, Morten Wegelbye Nissen <mwn@monit.dk> wrote:
>>  Hello List,
>> 
>> No matter where you look, almost everywhere you read that the NoSQL
>> data schema is completely different from the relational way - and after a
>> little insight into Cassandra everyone can second that.
>> 
>> But I have yet to see some real-life examples of how a real system can be
>> modelled. Let's take the example of a system where users can send messages
>> to each other. ( Completely imaginary, no one would use Cassandra for a
>> mail system :) )
>> 
>> If one were to create such a system, what CFs would be used? And how would
>> you, for example, find all unread messages?
>> 
>> ./Morten
>> 
