incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: Data model question, storing Queue Message
Date Mon, 30 Apr 2012 03:52:16 GMT
A message queue is often not a great use case for Cassandra. For information on how to handle
high-delete workloads, see http://www.datastax.com/dev/blog/leveled-compaction-in-apache-cassandra

It's hard to create a model without some idea of the data load, but I would suggest you start
with:

CF: UserMessages
Key: ReceiverID
Columns : column name = TimeUUID ; column value = message ID and Body

That will order the messages by time. 
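
As a rough illustration of that layout, here is a minimal in-memory sketch in Python. It is not a Cassandra client: the class UserMessages, its methods, and the helper timeuuid_at are hypothetical names made up for this example, and the dict only imitates one row per ReceiverID with TimeUUID column names. The sort by embedded timestamp stands in for what a TimeUUID comparator would do on the column names.

```python
import uuid
from collections import defaultdict
from datetime import datetime, timedelta, timezone

# Offset between the UUID epoch (1582-10-15) and the Unix epoch, in 100 ns units.
UUID_EPOCH_OFFSET = 0x01B21DD213814000
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def timeuuid_at(dt, clock_seq=0, node=0):
    """Build a version 1 (time-based) UUID for a timezone-aware datetime."""
    ticks = (dt - UNIX_EPOCH) // timedelta(microseconds=1) * 10 + UUID_EPOCH_OFFSET
    return uuid.UUID(fields=(
        ticks & 0xFFFFFFFF,                 # time_low
        (ticks >> 32) & 0xFFFF,             # time_mid
        ((ticks >> 48) & 0x0FFF) | 0x1000,  # time_hi plus version 1
        ((clock_seq >> 8) & 0x3F) | 0x80,   # clock_seq_hi plus RFC 4122 variant
        clock_seq & 0xFF,                   # clock_seq_low
        node,
    ))


class UserMessages:
    """Toy stand-in for the UserMessages column family (assumption, not a real API)."""

    def __init__(self):
        # row key (ReceiverID) -> {column name (TimeUUID): (message ID, body)}
        self.rows = defaultdict(dict)

    def store(self, receiver_id, sent_at, message_id, body):
        # Two messages with the same timestamp, clock_seq and node would collide
        # here; real TimeUUIDs vary clock_seq/node to avoid that.
        self.rows[receiver_id][timeuuid_at(sent_at)] = (message_id, body)

    def fetch_all(self, receiver_id):
        # Return the whole row, ordered by the timestamp embedded in each column name.
        row = self.rows.get(receiver_id, {})
        return [row[name] for name in sorted(row, key=lambda u: u.time)]

    def delete_all(self, receiver_id):
        # Deleting the row removes every queued message for this receiver.
        self.rows.pop(receiver_id, None)
```

Reading the whole queue for a user is then a single-row read, and clearing it is a single-row delete, which maps onto the "retrieve all" and "delete all" operations in the question.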

Depending on load (and to support deleting a previous month's messages) you may want to partition
the rows by month:

CF: UserMessagesMonth
Key: ReceiverID+YYYYMM
Columns : column name = TimeUUID ; column value = message ID and Body

Everything is the same as before, but now a user has one row per month, which you can delete
as a whole. This also helps avoid very big rows. 
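
To make the monthly row keys concrete, here is a small Python sketch of how the keys could be built and how to tell which rows fall outside a retention window. The function names, the ":" separator between ReceiverID and YYYYMM, and the months_kept parameter are all assumptions for illustration; the model above only says the key is ReceiverID+YYYYMM.

```python
from datetime import datetime, timezone


def shift_month(year, month, delta):
    """Return (year, month) moved by delta months; delta may be negative."""
    idx = year * 12 + (month - 1) + delta
    return idx // 12, idx % 12 + 1


def month_row_key(receiver_id, sent_at):
    """Row key for the month a message was sent in, e.g. '42:201204'."""
    return f"{receiver_id}:{sent_at:%Y%m}"


def row_keys_to_read(receiver_id, now, months_kept):
    """Row keys holding a user's retained messages, newest month first."""
    keys = []
    for back in range(months_kept):
        year, month = shift_month(now.year, now.month, -back)
        keys.append(f"{receiver_id}:{year:04d}{month:02d}")
    return keys


def is_expired(row_key, now, months_kept):
    """True if the row's month is older than the retained window."""
    bucket = row_key.rsplit(":", 1)[1]
    year, month = shift_month(now.year, now.month, -months_kept)
    # YYYYMM strings compare correctly as plain strings.
    return bucket <= f"{year:04d}{month:02d}"
```

The trade-off is that reading a user's full queue now touches one row per retained month instead of a single row, in exchange for cheap whole-row deletes of old months.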

> I really don't think that storage will be an issue, I have 2TB per nodes, messages are
> 1KB limited.
I would suggest you keep the per-node limit to 300 to 400 GB. It can take a long time to compact,
repair and move the data when it gets above 400 GB. 

Hope that helps. 

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 27/04/2012, at 1:30 AM, Morgan Segalis wrote:

> Hi everyone !
> 
> I'm fairly new to Cassandra and I'm not yet familiar with the column-oriented NoSQL
> model.
> I have worked a while on it, but I can't seem to find the best model for what I'm looking
> for.
> 
> I have Erlang software that lets users connect and communicate with each other. When
> a user (A) sends a message to a disconnected user (B), it stores it in the database and waits
> for user (B) to connect, retrieve the message queue, and delete it.
> 
> Here are some key points:
> - Users are identified by integer IDs
> - Each message is unique by the combination of: Sender ID - Receiver ID - Message ID - time
> 
> I have a message queue, and here are the operations I would need to do as fast as possible:
> 
> - Store from 1 to X messages per registered user
> - Get the number of stored messages per user (can be an incremental variable updated at
>   each store; this is often retrieved)
> - Retrieve all messages from a user at once.
> - Delete all messages from a user at once.
> - Delete all messages that are older than Y months (from all users).
> 
> I really don't think that storage will be an issue, I have 2TB per nodes, messages are
> 1KB limited.
> I'm really looking for speed rather than storage optimization.
> 
> My configuration is 2 dedicated servers, each with:
> - 4 x Intel i7 2.66 GHz
> - 64 bits
> - 24 GB
> - 2 TB
> 
> Thank you all.

