cassandra-user mailing list archives

From aaron morton <>
Subject Re: Recommandation on how to organize CF
Date Fri, 20 May 2011 03:14:26 GMT
I'm a bit confused by your examples. I think you are saying...

- A standard CF called Message, using UTF8Type for column comparison, that stores the individual
messages. Row key is the message UUID. Not sure what the columns are.
- A standard CF called MessageTime, using TimeUUIDType for column comparison, that stores collections
of messages. Row key is "messagelist:<message_list_uuid>" for a message list and "messagebox:<user_name>:<mbox_name>"
for a message box. Not sure what the columns are.

The best model is going to be the one that supports your read requests and the volume of data
you are expecting.

One way to go is to denormalise to support very fast read paths. You could store the entire
message in one column, using something like JSON to serialise it. Then:

- A MessageIndexes standard CF to store the full messages in context. There are three different
types of rows:
	* keys of the form <user_name> store all messages for a user; the column name is the message
TimeUUID and the value is the message structure.
	* keys of the form <user_name>/<mbox_name> store the messages for a single message box.
Columns as above.
	* keys of the form <user_name>/<mbox_name>/<mlist_name> store the messages in
a single message list. Columns as above.
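
As a rough sketch of the three row types above, here is the write fan-out modelled with plain Python dicts standing in for the CF. This is not a Cassandra client; the function names (`index_row_keys`, `store_message`, `read_row`) and the message fields are made up for illustration.

```python
import json
import uuid
from collections import defaultdict

# Stand-in for the MessageIndexes CF:
# row key -> {column name (TimeUUID) -> JSON-serialised message}
message_indexes = defaultdict(dict)


def index_row_keys(user_name, mbox_name, mlist_name):
    """The three row keys a single message is written under."""
    return [
        user_name,
        f"{user_name}/{mbox_name}",
        f"{user_name}/{mbox_name}/{mlist_name}",
    ]


def store_message(user_name, mbox_name, mlist_name, message):
    """Write the full message into all three rows (one column per row)."""
    column_name = uuid.uuid1()  # time-based UUID, like a TimeUUIDType column
    value = json.dumps(message)
    for row_key in index_row_keys(user_name, mbox_name, mlist_name):
        message_indexes[row_key][column_name] = value
    return column_name


def read_row(row_key):
    """Read one row; sorting by UUID timestamp imitates TimeUUIDType ordering."""
    row = message_indexes.get(row_key, {})
    return [json.loads(row[name]) for name in sorted(row, key=lambda u: u.time)]
```

Every read is then a single row slice, at the cost of writing each message three times.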

- A MessageFolders CF to store the message boxes and message lists. Two approaches:
	1) <user_name> as the key and each column is a message box; its message lists are stored in
that single column as JSON.
	2) a <user_name> row for the top level, with a column for each message box, and a <user_name>/<message_box>
row for the next level, with a column for each message list.
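
A minimal sketch of approach 1, again with a dict standing in for the CF rather than a real client. The helper names are hypothetical, and the read-then-write in `add_message_list` is an assumption about how you would update the JSON column; it is not atomic, so concurrent updates to the same message box could overwrite each other.

```python
import json

# Stand-in for the MessageFolders CF:
# row key (<user_name>) -> {column name (message box) -> JSON list of message lists}
message_folders = {}


def add_message_list(user_name, mbox_name, mlist_name):
    """Read the JSON column, append the message list, write the column back."""
    row = message_folders.setdefault(user_name, {})
    lists = json.loads(row.get(mbox_name, "[]"))
    if mlist_name not in lists:
        lists.append(mlist_name)
    row[mbox_name] = json.dumps(lists)


def message_boxes(user_name):
    """All message boxes for a user: just the column names of one row."""
    return sorted(message_folders.get(user_name, {}))


def message_lists(user_name, mbox_name):
    """All message lists in one message box: decode a single column."""
    return json.loads(message_folders.get(user_name, {}).get(mbox_name, "[]"))
```

Approach 2 avoids the read-before-write, since adding a message list becomes a plain column insert into the <user_name>/<message_box> row.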

Or if space is a concern just store the UUID of the message in the index CF and add a CF to
store the messages. 
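
Sketched the same way (dicts instead of CFs, hypothetical names), the space-saving variant could look like this. I am assuming the index column name is the message TimeUUID with an empty value, and that the same UUID is the row key in the message CF.

```python
import json
import uuid

messages = {}         # message CF: row key (message UUID as string) -> columns
message_indexes = {}  # index CF: row key -> {message TimeUUID -> empty value}


def store_message(index_row_keys, message):
    """Write the message once, then only its UUID into each index row."""
    message_id = uuid.uuid1()
    messages[str(message_id)] = {"body": json.dumps(message)}
    for row_key in index_row_keys:
        message_indexes.setdefault(row_key, {})[message_id] = b""
    return message_id


def read_messages(index_row_key):
    """Two steps: slice the index row, then look up each message by UUID."""
    ids = sorted(message_indexes.get(index_row_key, {}), key=lambda u: u.time)
    return [json.loads(messages[str(i)]["body"]) for i in ids]
```

The message body is stored once, but every read now needs a second round of lookups against the message CF.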

It's also going to depend on the management features, e.g. can you rename a message box / list?
Move messages around? If so, the denormalised pattern may not be the best, as those operations
will take longer.

Hope that helps. 

Aaron Morton
Freelance Cassandra Developer

On 19 May 2011, at 05:44, openvictor Open wrote:

> Hello all,
> I know organization is a broad topic and everybody may have an idea on how to do it,
> but I would really like some advice and opinions, and I think it could be interesting to
> discuss this matter.
> Here is my problem: I am designing a messaging system internal to a website. There are
> 3 big structures: Message, MessageList and MessageBox. A Message/MessageList is identified
> only by a UUID; a MessageBox is identified by a name (UTF-8 string). A MessageBox has a set
> of MessageLists in it and a MessageList has a set of Messages in it, all of them being UUIDs.
> Currently I have only two CFs: message and message_time. message uses UTF8Type (Cassandra
> 0.6.11, soon going to 0.8) and message_time uses TimeUUIDType.
> For example, if I want to request all messages in a certain messagelist I do: message_time['messagelist:uuid(messagelist)']
> If I want the information for a message I do: message['message:uuid(message)']
> If I want all messagelists for a certain messagebox (called nameofbox for user openvictor
> in this example) I do: message_time['messagebox:openvictor:nameofbox']
> My question to Cassandra users is: is it a good idea to group all those things into
> two CFs? Are there advantages / drawbacks to these two CFs, and for the long term should I change
> my organization?
> Thank you,
> Victor
