cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Pål Andreassen <pal.andreas...@bouvet.no>
Subject Cassandra 2.2 for time series
Date Mon, 31 Aug 2015 13:48:12 GMT
Hi

I'm currently evaluating Cassandra as a potiantial database for storing time series data from
lots of devices (IoT type of scenario).
Currently we have a few thousand devices with X channels (measurements) that they report at
different intervals (from 5 minutes and up).

I've created as simple test table to store the data:

CREATE TABLE DataRaw(
  channelId int,
  sampleTime timestamp,
  value double,
  PRIMARY KEY (channelId, sampleTime)
) WITH CLUSTERING ORDER BY (sampleTime ASC);

This schema seems to work ok, but I have queries that I need to support that I cannot easily
figure out how to perform (except getting all the data out and iterate it myself).

Query 1: For max and min queries, I not only want the maximum/minimum value, but also the
corresponding timestamp.


sampleTime          value

2015-08-28 00:00    10

2015-08-28 01:00    15

2015-08-28 02:00    13

I'd like the max query to return both 2015-08-28 01:00 and 15. SELECT sampleTime, max(value)
FROM DataRAW return the max value, but the first sampleTime.
Also I wonder if Cassandra has built-in support for interpolation/extrapolation. Some sort
of group by hour/day/week/month and even year function.

Query 2: Give me hourly averages for channel X for yesterday. I'd expect to get 24 values
each of which is the hourly average. Or give my daily averages for last year for a given channel.
Should return 365 daily averages.

Best regards

Pål Andreassen
54°23'58"S 3°18'53"E
Konsulent
Mobil +47 982 85 504
pal.andreassen@bouvet.no<mailto:pal.andreassen@bouvet.no>

Bouvet Norge AS
Avdeling Grenland
Uniongata 18, Klosterøya
N-3732 Skien
Tlf +47 23 40 60 00
bouvet.no<http://www.bouvet.no/?utm_campaign=e-post&utm_source=ansatt&utm_medium=email>


Mime
View raw message