cassandra-user mailing list archives

From Andrew Nguyen <andrew-lists-cassan...@ucsfcti.org>
Subject Best way to store millisecond-accurate data
Date Sat, 24 Apr 2010 00:01:04 GMT
Hello,

I am looking to store patient physiologic data in Cassandra - it's being collected at rates
of 1 to 125 Hz.  I'm thinking of storing the timestamps as the column names and the patient/parameter
combo as the row key.  For example, Bob is in the ICU and is currently having his blood pressure,
intracranial pressure, and heart rate monitored.  I'd like to collect this with the following
row keys:

Bob-bloodpressure
Bob-intracranialpressure
Bob-heartrate
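
To make the layout concrete, here is a rough sketch in plain Python. No Cassandra client is involved: `row_key`, `record`, and the in-memory `column_family` dict are hypothetical stand-ins for illustration, and using milliseconds since the Unix epoch as the column name is just an assumption for the example.

```python
from collections import defaultdict


def row_key(patient: str, parameter: str) -> str:
    """Build the patient/parameter row key, e.g. 'Bob-heartrate'."""
    return f"{patient}-{parameter}"


# Illustrative stand-in for a column family: row key -> {column name: value}.
# Column names are sample timestamps (ms since the Unix epoch, an assumption).
column_family = defaultdict(dict)


def record(patient: str, parameter: str, timestamp_ms: int, value: float) -> None:
    """Store one sample as a column in the patient/parameter row."""
    column_family[row_key(patient, parameter)][timestamp_ms] = value


# Two heart-rate samples 8 ms apart (125 Hz) and one blood-pressure sample.
record("Bob", "heartrate", 1272018225016, 72)
record("Bob", "heartrate", 1272018225024, 73)
record("Bob", "bloodpressure", 1272018225016, 118)
```

Each monitored parameter gets its own row, and every new sample adds one column to that row.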

The column names would be timestamps but that's where my questions start:

I'm not sure what the best data type and CompareWith would be.  From my searching, it sounds
like the TimeUUID may be suitable but isn't really designed for millisecond accuracy.  My
other thought is just to store them as strings (2010-04-23 10:23:45.016).  While space isn't
the foremost concern, we will be collecting this data 24/7 so we'll be creating many columns
over the long-term.  
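
For a sense of the size difference, here is a minimal sketch comparing the string form with an 8-byte integer of milliseconds since the Unix epoch. The integer encoding is only an option for comparison, not something I have settled on. The sketch assumes UTC timestamps, and `to_epoch_ms` is a hypothetical helper. As I understand it, a long comparator such as LongType would expect this kind of 8-byte big-endian value, but I have not verified that.

```python
import struct
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_ms(ts: datetime) -> int:
    """Milliseconds since the Unix epoch, using integer arithmetic only."""
    delta = ts - EPOCH
    return (delta.days * 86_400 + delta.seconds) * 1000 + delta.microseconds // 1000


# The example timestamp from above, assumed to be UTC.
ts = datetime(2010, 4, 23, 10, 23, 45, 16_000, tzinfo=timezone.utc)

# Option 1: human-readable string, truncated to millisecond precision.
as_string = ts.strftime("%Y-%m-%d %H:%M:%S.%f")[:-3].encode("ascii")

# Option 2: 8-byte big-endian signed integer of epoch milliseconds.
as_long = struct.pack(">q", to_epoch_ms(ts))

print(len(as_string), len(as_long))  # 23 8
```

The string costs 23 bytes per column name against 8 for the integer, which adds up at 125 columns per second.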

I found https://issues.apache.org/jira/browse/CASSANDRA-16 which states that the entire row
must fit in memory.  Does this include the values as well as the column names?

In considering the limits of Cassandra and the best way to model this, we would be adding
3.9 billion columns per row per year (assuming 125 Hz, 24/7).  However, I can't really think of a better
way to model this...  So, am I thinking about this all wrong or am I on the right track?
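
For reference, the 3.9 billion figure comes from the following arithmetic, assuming one monitored parameter sampled at a constant 125 Hz over a 365-day year:

```python
SAMPLE_RATE_HZ = 125
SECONDS_PER_YEAR = 60 * 60 * 24 * 365  # 31,536,000 in a non-leap year

# One sample per tick, around the clock, for a single patient/parameter pair.
samples_per_year = SAMPLE_RATE_HZ * SECONDS_PER_YEAR

print(f"{samples_per_year:,}")  # 3,942,000,000
```

Parameters sampled at lower rates (down to 1 Hz) would contribute proportionally fewer samples.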

Thanks,
Andrew