incubator-cassandra-user mailing list archives

From Jean-Pierre Bergamin <>
Subject Re: Time-series data model
Date Thu, 15 Apr 2010 09:27:47 GMT
On 14.04.2010 15:22, Ted Zlatanov wrote:
> On Wed, 14 Apr 2010 15:02:29 +0200 "Jean-Pierre Bergamin"<>  wrote:
> JB>  The metrics are stored together with a timestamp. The queries we want to
> JB>  perform are:
> JB>   * The last value of a specific metric of a device
> JB>   * The values of a specific metric of a device between two timestamps t1 and
> JB>  t2
> Make your key "devicename-metricname-YYYYMMDD-HHMM" (with whatever time
> sharding makes sense to you; I use UTC by-hours and by-day in my
> environment).  Then your supercolumn is the collection time as a
> LongType and your columns inside the supercolumn can express the metric
> in detail (collector agent, detailed breakdown, etc.).
Just for my understanding: what is "time sharding"? I couldn't find an 
explanation anywhere. Do you mean that the time-series data is rolled 
up into 5-minute, 1-hour, 1-day etc. slices?
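
One possible reading of the quoted key format is that each row holds the
samples of one device/metric pair for one time bucket, so the bucket start
becomes part of the row key. A minimal sketch under that assumption, using
hourly UTC buckets; the helper name row_key and the epoch-seconds input are
made up for illustration and are not part of any Cassandra API:

```python
from datetime import datetime, timezone

def row_key(device: str, metric: str, epoch_seconds: int) -> str:
    """Build a "devicename-metricname-YYYYMMDD-HHMM" key (hypothetical helper).

    Assumes hourly buckets in UTC: the timestamp is truncated to the
    start of its hour, so all samples of that hour share one row.
    """
    ts = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    bucket = ts.replace(minute=0, second=0, microsecond=0)
    return f"{device}-{metric}-{bucket:%Y%m%d-%H%M}"

# 1271323667 is 2010-04-15 09:27:47 UTC, which falls in the 09:00 bucket.
print(row_key("device1", "metric1", 1271323667))
```

With daily buckets the same idea applies; only the truncation step changes.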

So this would be defined as:
<ColumnFamily Name="measurements" ColumnType="Super" 
CompareWith="UTF8Type"  CompareSubcolumnsWith="LongType" />

So when I want to read all values of one metric between two timestamps 
t0 and t1, I'd have to read the rows that match a key range 
(device1:metric1:t0 - device1:metric1:t1) and then all the supercolumns 
for each of these keys?
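
As a rough sketch of that read path: instead of scanning a key range, the
client could enumerate the bucket keys that cover the interval and fetch
each of them, slicing the supercolumns between t0 and t1. The code below
assumes hourly UTC buckets and the hyphen-separated key format quoted
above; row_keys_between is a hypothetical helper, and the actual fetch is
left out because it depends on the client library in use.

```python
from datetime import datetime, timedelta, timezone

def row_keys_between(device: str, metric: str, t0: int, t1: int) -> list:
    """List the hourly row keys covering epoch seconds t0..t1 (inclusive).

    Hypothetical helper: assumes one row per device/metric/hour in UTC.
    """
    bucket = datetime.fromtimestamp(t0, tz=timezone.utc).replace(
        minute=0, second=0, microsecond=0
    )
    end = datetime.fromtimestamp(t1, tz=timezone.utc)
    keys = []
    while bucket <= end:
        keys.append(f"{device}-{metric}-{bucket:%Y%m%d-%H%M}")
        bucket += timedelta(hours=1)
    return keys

# 09:27:47 to 10:27:47 UTC on 2010-04-15 touches two hourly rows.
for key in row_keys_between("device1", "metric1", 1271323667, 1271327267):
    print(key)
```

Only the first and last rows can contain samples outside the interval, so
the supercolumn slice from t0 to t1 matters mainly for those two.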

