cassandra-user mailing list archives

From Tristan Seligmann <mithra...@mithrandi.net>
Subject Help with schema modelling
Date Sun, 17 Jul 2011 00:08:47 GMT
I'm trying to model a schema for a logging storage system in
Cassandra. Log messages consist of a timestamp, a message, and some
other arbitrary key/value pairs. Querying would primarily be done
based on timestamp ranges. I will probably filter on matches against
the key/value pairs as well, but I expect that will be handled by
fetching the messages in the desired time range and then filtering out
the uninteresting ones on the client.
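
To make the fetch-then-filter step concrete, here is a minimal Python
sketch. It assumes each log message has already been fetched and is
represented as a plain dict; the function name filter_messages and the
field names are hypothetical and not part of any Cassandra client API.

```python
def filter_messages(messages, **criteria):
    """Keep only messages whose key/value pairs match every criterion.

    `messages` is assumed to be an iterable of dicts, one per log
    message, already fetched for the desired time range.
    """
    return [
        msg for msg in messages
        # A message missing a key never matches a criterion on that key.
        if all(key in msg and msg[key] == value
               for key, value in criteria.items())
    ]


# Hypothetical sample data for illustration only.
fetched = [
    {"timestamp": 1310861327, "message": "disk full", "level": "error"},
    {"timestamp": 1310861330, "message": "started", "level": "info"},
]
errors = filter_messages(fetched, level="error")
```

This only works well if the time range keeps the fetched set small
enough to filter in memory.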

A supercolumn makes it easy enough to store the key/value pairs as
columns, but then I end up with all of my log messages in a single
row, which obviously won't work. On the other hand, if I use the
timestamp as the row key, I need to use the order-preserving
partitioner (OPP) to query on ranges, and I'd prefer not to deal with
the balancing issues that would raise. I suppose I could go halfway:
use a prefix of the timestamp (e.g. date + hour, or perhaps date +
hour + minute) as the row key, and then retrieve all of the keys in
the range I'm interested in when performing a query.
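
As a rough sketch of that halfway approach, the row keys for a query
could be computed like this. It assumes hourly buckets and a
YYYYMMDDHH key format; both are illustrative choices of mine, and
bucket_keys is a hypothetical helper, not anything Cassandra provides.

```python
from datetime import datetime, timedelta


def bucket_keys(start, end):
    """Return the hourly row keys covering the range [start, end]."""
    # Truncate the start time down to the beginning of its hour bucket.
    current = start.replace(minute=0, second=0, microsecond=0)
    keys = []
    while current <= end:
        # Assumed key format: date + hour, e.g. "2011071700".
        keys.append(current.strftime("%Y%m%d%H"))
        current += timedelta(hours=1)
    return keys


keys = bucket_keys(datetime(2011, 7, 17, 0, 8, 47),
                   datetime(2011, 7, 17, 2, 30, 0))
```

The resulting keys would then be fetched individually (or in one
multi-key read), and the columns inside the first and last buckets
trimmed to the exact time range.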

I feel like I'm missing something, though, so I was hoping for some
advice from more experienced users of Cassandra.
-- 
mithrandi, i Ainil en-Balandor, a faer Ambar
