cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Peter Lin <wool...@gmail.com>
Subject Re: Cassandra 2.2 for time series
Date Mon, 31 Aug 2015 14:12:00 GMT
I didn't realize they had added max and min as stock functions.

to get the sample time. you'll probably need to write a custom function.
google for it and you'll find people that have done it.

On Mon, Aug 31, 2015 at 10:09 AM, Pål Andreassen <pal.andreassen@bouvet.no>
wrote:

> Cassandra 2.2 has min and max built-in. My problem is getting the
> corresponding sample time as well.
>
>
>
> *Pål Andreassen*
>
> *54°23'58"S 3°18'53"E*
>
> *Konsulent*
>
> Mobil +47 982 85 504
>
> pal.andreassen@bouvet.no
>
>
>
>
> *Bouvet Norge AS Avdeling Grenland*
>
> Uniongata 18, Klosterøya
>
> N-3732 Skien
>
> Tlf +47 23 40 60 00
>
> *bouvet.no*
> <http://www.bouvet.no/?utm_campaign=e-post&utm_source=ansatt&utm_medium=email>
>
>
>
> *From:* Peter Lin [mailto:woolfel@gmail.com]
> *Sent:* mandag 31. august 2015 16.09
> *To:* user@cassandra.apache.org
> *Subject:* Re: Cassandra 2.2 for time series
>
>
>
>
>
> Unlike SQL, CQL doesn't have built-in functions like max/min
>
> In the past, people would create summary tables to keep rolling stats for
> reports/analytics. In cql3, there's user defined functions, so you can
> write a function to do max/min
>
> http://cassandra.apache.org/doc/cql3/CQL-2.2.html#selectStmt
> http://cassandra.apache.org/doc/cql3/CQL-2.2.html#udfs
>
>
>
> On Mon, Aug 31, 2015 at 9:48 AM, Pål Andreassen <pal.andreassen@bouvet.no>
> wrote:
>
> Hi
>
>
>
> I’m currently evaluating Cassandra as a potiantial database for storing
> time series data from lots of devices (IoT type of scenario).
>
> Currently we have a few thousand devices with X channels (measurements)
> that they report at different intervals (from 5 minutes and up).
>
>
>
> I’ve created as simple test table to store the data:
>
>
>
> CREATE TABLE DataRaw(
>
>   channelId int,
>
>   sampleTime timestamp,
>
>   value double,
>
>   PRIMARY KEY (channelId, sampleTime)
>
> ) WITH CLUSTERING ORDER BY (sampleTime ASC);
>
>
>
> This schema seems to work ok, but I have queries that I need to support
> that I cannot easily figure out how to perform (except getting all the data
> out and iterate it myself).
>
>
>
> Query 1: For max and min queries, I not only want the maximum/minimum
> value, but also the corresponding timestamp.
>
>
>
> sampleTime          value
>
> 2015-08-28 00:00    10
>
> 2015-08-28 01:00    15
>
> 2015-08-28 02:00    13
>
>
> I'd like the max query to return both 2015-08-28 01:00 and 15. SELECT
> sampleTime, max(value) FROM DataRAW return the max value, but the first
> sampleTime.
>
> Also I wonder if Cassandra has built-in support for
> interpolation/extrapolation. Some sort of group by hour/day/week/month and
> even year function.
>
>
>
> Query 2: Give me hourly averages for channel X for yesterday. I’d expect
> to get 24 values each of which is the hourly average. Or give my daily
> averages for last year for a given channel. Should return 365 daily
> averages.
>
>
>
> Best regards
>
>
>
> *Pål Andreassen*
>
> *54°23'58"S 3°18'53"E*
>
> *Konsulent*
>
> Mobil +47 982 85 504
>
> pal.andreassen@bouvet.no
>
>
>
>
> *Bouvet Norge AS Avdeling Grenland*
>
> Uniongata 18, Klosterøya
>
> N-3732 Skien
>
> Tlf +47 23 40 60 00
>
> *bouvet.no*
> <http://www.bouvet.no/?utm_campaign=e-post&utm_source=ansatt&utm_medium=email>
>
>
>
>
>

Mime
View raw message