Return-Path: X-Original-To: apmail-cassandra-user-archive@www.apache.org Delivered-To: apmail-cassandra-user-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 2FB48102EC for ; Sat, 23 Nov 2013 03:03:55 +0000 (UTC) Received: (qmail 65828 invoked by uid 500); 23 Nov 2013 03:02:24 -0000 Delivered-To: apmail-cassandra-user-archive@cassandra.apache.org Received: (qmail 65770 invoked by uid 500); 23 Nov 2013 03:02:18 -0000 Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@cassandra.apache.org Delivered-To: mailing list user@cassandra.apache.org Received: (qmail 65742 invoked by uid 99); 23 Nov 2013 03:02:17 -0000 Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Sat, 23 Nov 2013 03:02:17 +0000 X-ASF-Spam-Status: No, hits=-6.5 required=5.0 tests=ENV_AND_HDR_SPF_MATCH,HTML_MESSAGE,RCVD_IN_DNSWL_LOW,SPF_PASS,USER_IN_DEF_SPF_WL X-Spam-Check-By: apache.org Received-SPF: pass (nike.apache.org: domain of michael.laing@nytimes.com designates 209.85.214.176 as permitted sender) Received: from [209.85.214.176] (HELO mail-ob0-f176.google.com) (209.85.214.176) by apache.org (qpsmtpd/0.29) with ESMTP; Sat, 23 Nov 2013 03:02:11 +0000 Received: by mail-ob0-f176.google.com with SMTP id va2so2137248obc.21 for ; Fri, 22 Nov 2013 19:01:48 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:date :message-id:subject:from:to:content-type; bh=S2r1rQrxfc877wIflvdPbqVYe3Y3QFYZcuBGfXfdpwQ=; b=FrXpJjjSlUOPokElfnCuFWiDXci7KFK9633Qe79ovFo3eDywzQbY1udnf/ufAUlNr3 JKP8NxU/nzCrTLGf4V2PdptN5pT8N48Y/UqwNvEgr3q8y4vObfEDJWsfTYVQBneg0UA0 UpKheutqIWp2h1p0Ahr1ekWNLJ/1JKVJb8tcTnvhVDieQaAvx9bw45HmITEJ3CIce4Bs MqLNAv6h4hfdFUQvhOLQyoo6tXbm9CdjMWcBR42DxZCi/h/qy6XL7w02mX/mCVu4CrKH evjmGRIGkoQ2ZzIZTBBcezOTIg6BnHEAlfpjKz5If3oolG2rhOtSIaXJQuKBi1sT3JI0 E0+w== X-Gm-Message-State: ALoCoQlor8cFkEayfmMmMy/oBDayEGX3vQv+Ld+iiUCFvHxLBj1wHhfidmKsouCgDkeXoZUU4Q49 MIME-Version: 1.0 X-Received: by 10.60.117.38 with SMTP id kb6mr13478003oeb.7.1385175708281; Fri, 22 Nov 2013 19:01:48 -0800 (PST) Received: by 10.182.107.134 with HTTP; Fri, 22 Nov 2013 19:01:48 -0800 (PST) In-Reply-To: References: Date: Fri, 22 Nov 2013 22:01:48 -0500 Message-ID: Subject: Re: CQL and counters From: "Laing, Michael" To: user@cassandra.apache.org Content-Type: multipart/alternative; boundary=047d7b414d240b3de104ebcf5a8a X-Virus-Checked: Checked by ClamAV on apache.org --047d7b414d240b3de104ebcf5a8a Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Here's another example that may help: -- put this in AND run using 'cqlsh -f DROP KEYSPACE bryce_test; CREATE KEYSPACE bryce_test WITH replication =3D { 'class': 'SimpleStrategy', 'replication_factor' : 1 }; USE bryce_test; CREATE TABLE samples ( name text, bucket text, count counter, total counter, PRIMARY KEY (name, bucket) ); UPDATE samples SET count =3D count + 1, total =3D total + 1 WHERE name=3D't= est' AND bucket=3D'2013-11-22T19:00'; UPDATE samples SET count =3D count + 1, total =3D total + 2 WHERE name=3D't= est' AND bucket=3D'2013-11-22T19:00'; UPDATE samples SET count =3D count + 1, total =3D total + 3 WHERE name=3D't= est' AND bucket=3D'2013-11-22T19:15'; UPDATE samples SET count =3D count + 1, total =3D total + 4 WHERE name=3D't= est' AND bucket=3D'2013-11-22T19:30'; UPDATE samples SET count =3D count + 1, total =3D total + 5 WHERE name=3D't= est' AND bucket=3D'2013-11-22T19:30'; SELECT * FROM samples; SELECT * FROM samples WHERE name =3D 'test' AND bucket >=3D '2013-11-22T19:30' AND bucket <=3D '2013-11-22T19:45'; -- returns: -- name | bucket | count | total --------+------------------+-------+------- -- test | 2013-11-22T19:00 | 2 | 3 -- test | 2013-11-22T19:15 | 1 | 3 -- test | 2013-11-22T19:30 | 2 | 9 --(3 rows) -- name | bucket | count | total --------+------------------+-------+------- -- test | 2013-11-22T19:30 | 2 | 9 --(1 rows) On Fri, Nov 22, 2013 at 7:21 PM, Tyler Hobbs wrote: > Something like this would work: > > CREATE TABLE foo ( > interface text, > property text, > bucket timestamp, > count counter, > PRIMARY KEY ((interface, property), bucket) > ) > > interface is 'NIC1' and property is 'Total' or 'Count'. > > To query over a date range, you'd run a query like: > > SELECT bucket, count FROM foo WHERE interface=3D'NIC1' AND property=3D'to= tal' > AND bucket > '2013-11-22 10:00:00' AND bucket < '2013-11-22 12:00:00'; > > > On Fri, Nov 22, 2013 at 4:48 PM, Bryce Godfrey wrote: > >> I=E2=80=99m looking for some guidance on how to model some stat trackin= g over >> time, bucketed to some type of interval (15 min, hour, etc). >> >> >> >> As an example, let=E2=80=99s say I would like to track network traffic t= hroughput >> and bucket it to 15 minute intervals. In our old model, using thrift I >> would create a column family set to counter, and use a timestamp ticks f= or >> the column name for a =E2=80=9Ctotal=E2=80=9D and =E2=80=9Ccount=E2=80= =9D column. And as data was sampled, >> we would increment count by one, and increment the total with the sample= d >> value for that time bucket. The column name would give us the datetime = for >> the values, as well as provide me with a convenient row slice query to g= et >> a date range for any given statistic. >> >> >> >> Key | 1215 | 1230 | 1245 >> >> NIC1:Total | 100 | 56 | 872 >> >> NIC1:Count | 15 | 15 | 15 >> >> >> >> Then given the total/count I can show an average over time. >> >> >> >> In CQL it seems like I can=E2=80=99t do new counter columns at runtime u= nless >> they are defined in the schema first or run an ALTER statement, which ma= y >> not be the correct way to go. So is there a better way to model this ty= pe >> of data with the new CQL world? Nor do I know how to query that type of >> data, similar to the row slice by column name. >> >> >> >> Thanks, >> >> Bryce >> > > > > -- > Tyler Hobbs > DataStax > --047d7b414d240b3de104ebcf5a8a Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable
Here's another example that may help:

-- put this in <file> AND run using 'cqlsh -f <file>= ;

DROP KEYSPACE bryce_test;

CREATE KEYSPACE bryce_test WITH replication =3D {
=C2=A0 =C2=A0 &= #39;class': 'SimpleStrategy',=C2=A0
=C2=A0 =C2=A0 = 9;replication_factor' : 1
};

USE bry= ce_test;

CREATE TABLE samples (
=C2=A0 =C2=A0 name text,
<= div>=C2=A0 =C2=A0 bucket text,
=C2=A0 =C2=A0 count counter,
=
=C2=A0 =C2=A0 total counter,
=C2=A0 =C2=A0 PRIMARY KEY (name= , bucket)
);

UPDATE samples SET count =3D count + 1, total =3D total + 1 WHERE name=3D&#= 39;test' AND bucket=3D'2013-11-22T19:00';
UPDATE samp= les SET count =3D count + 1, total =3D total + 2 WHERE name=3D'test'= ; AND bucket=3D'2013-11-22T19:00';
UPDATE samples SET count =3D count + 1, total =3D total + 3 WHERE name= =3D'test' AND bucket=3D'2013-11-22T19:15';
UPDATE= samples SET count =3D count + 1, total =3D total + 4 WHERE name=3D'tes= t' AND bucket=3D'2013-11-22T19:30';
UPDATE samples SET count =3D count + 1, total =3D total + 5 WHERE name= =3D'test' AND bucket=3D'2013-11-22T19:30';

SELECT * FROM samples;

SELECT * FROM samp= les=C2=A0
WHERE=C2=A0
=C2=A0 =C2=A0 name =3D 'test'=C2=A0
=C2=A0 =C2=A0 AND bucket >=3D '2013-11-22T19:30'=C2=A0
=C2=A0 =C2=A0 AND bucket <=3D '2013-11-22T19:45';

-- returns:

-- name | bucket =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 | count | total
--------+------------------+-------+-------
-- test | 201= 3-11-22T19:00 | =C2=A0 =C2=A0 2 | =C2=A0 =C2=A0 3
-- test | 2013-= 11-22T19:15 | =C2=A0 =C2=A0 1 | =C2=A0 =C2=A0 3
-- test | 2013-11= -22T19:30 | =C2=A0 =C2=A0 2 | =C2=A0 =C2=A0 9

--(3 rows)


-- n= ame | bucket =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 | count | total
-= -------+------------------+-------+-------
-- test | 2013-11-22T1= 9:30 | =C2=A0 =C2=A0 2 | =C2=A0 =C2=A0 9

--(1 rows)



On Fri, Nov 22, 2013 at= 7:21 PM, Tyler Hobbs <tyler@datastax.com> wrote:
Something like this would w= ork:

CREATE TABLE = foo (
=C2=A0=C2=A0=C2=A0 interface text,
=C2=A0=C2=A0=C2=A0 property text,
= =C2=A0=C2=A0=C2=A0 bucket timestamp,
=C2=A0=C2=A0=C2=A0 count counter,=C2=A0=C2=A0=C2=A0 PRIMARY KEY ((interface, property),
bucket)
)


interface is 'NIC1' and property is 'Total'= or 'Count'.

To query over a date range, you'd run a que= ry like:

SELECT bucket, count FROM foo WHERE interface= =3D'NIC1' AND property=3D'total' AND bucket > '2013-11-22 10:00:00' AND bucket < '2013-11-22 12:00:00';


On Fri, Nov 22, 2013 at 4:48 PM, Bryce Godfrey <bryce.godfrey@avanade.com> wrote:

I=E2=80=99m looking for some guidance on how to mode= l some stat tracking over time, bucketed to some type of interval (15 min, = hour, etc).=C2=A0

=C2=A0

As an example, let=E2=80=99s say I would like to tra= ck network traffic throughput and bucket it to 15 minute intervals.=C2=A0 I= n our old model, using thrift I would create a column family set to counter= , and use a timestamp ticks for the column name for a =E2=80=9Ctotal=E2=80=9D and =E2=80=9Ccount=E2=80=9D column.=C2=A0 An= d as data was sampled, we would increment count by one, and increment the t= otal with the sampled value for that time bucket.=C2=A0 The column name wou= ld give us the datetime for the values, as well as provide me with a conven= ient row slice query to get a date range for any given statistic.=

=C2=A0

Key=C2=A0=C2=A0 =C2=A0=C2=A0=C2=A0=C2=A0 =C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0| 1215 =C2=A0| 1230 | 1245=

NIC1:Total =C2=A0=C2=A0| 100 =C2=A0=C2=A0=C2=A0| 56= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |=C2=A0 872

NIC1:Count | 15=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 | 15= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 | 15

=C2=A0

Then given the total/count I can show an average ove= r time.

=C2=A0

In CQL it seems like I can=E2=80=99t do new counter = columns at runtime unless they are defined in the schema first or run an AL= TER statement, which may not be the correct way to go.=C2=A0 So is there a = better way to model this type of data with the new CQL world?=C2=A0 Nor do I know how to query that type of data, similar= to the row slice by column name.

=C2=A0

Thanks,

Bryce




--
Tyler Hobbs

DataStax

--047d7b414d240b3de104ebcf5a8a--