Return-Path: X-Original-To: apmail-cassandra-user-archive@www.apache.org Delivered-To: apmail-cassandra-user-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 14594108C5 for ; Tue, 10 Sep 2013 22:33:25 +0000 (UTC) Received: (qmail 68259 invoked by uid 500); 10 Sep 2013 22:33:22 -0000 Delivered-To: apmail-cassandra-user-archive@cassandra.apache.org Received: (qmail 68233 invoked by uid 500); 10 Sep 2013 22:33:22 -0000 Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@cassandra.apache.org Delivered-To: mailing list user@cassandra.apache.org Received: (qmail 68225 invoked by uid 99); 10 Sep 2013 22:33:22 -0000 Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 10 Sep 2013 22:33:22 +0000 X-ASF-Spam-Status: No, hits=-6.5 required=5.0 tests=ENV_AND_HDR_SPF_MATCH,HTML_MESSAGE,RCVD_IN_DNSWL_LOW,SPF_PASS,USER_IN_DEF_SPF_WL X-Spam-Check-By: apache.org Received-SPF: pass (nike.apache.org: domain of michael.laing@nytimes.com designates 209.85.214.174 as permitted sender) Received: from [209.85.214.174] (HELO mail-ob0-f174.google.com) (209.85.214.174) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 10 Sep 2013 22:33:15 +0000 Received: by mail-ob0-f174.google.com with SMTP id wd6so7793389obb.33 for ; Tue, 10 Sep 2013 15:32:53 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:date :message-id:subject:from:to:content-type; bh=tvr+gHMnuZdm4/2YUhQ0Kh8XytiN1UIgscqjp8kne1I=; b=YJdJO2FD2AP7B00lZbjoL4OwhDN66jEnLnY18jTvqWTq0eI/aw3zcjOQxZY2TSv98w 88tF6CM2mOfGj6KNdn32KeGbstD84ymOQzwBKkdv3qIJwUHDrkxftqYNCp7zb2CnZvty lhGKzritokktXIriGJRQn4WYQHxWhSR48F315h5lo6aOKAsDuElkGxMNyh4F4dk/Xfnm Azm/92VZXQlpDDE/6FVVgt5JphCpiB4xzUZfUDFDEjSIPVf486R2O8WDofsyi/upF+m6 erifRHga0rbq+ZU+L9jLVLNHYYzDxoobIVwGl+5G2802GODpdhNGkqdZfIBMHKLzyqW5 2jFA== X-Gm-Message-State: ALoCoQnva07m3VlvTJ4ClILsklINR2RnzoO1vatq2UUN0WpFMPkepgA2ZWHYpTIWAXRcBmdV3cZ4 MIME-Version: 1.0 X-Received: by 10.182.134.229 with SMTP id pn5mr101778obb.88.1378852373028; Tue, 10 Sep 2013 15:32:53 -0700 (PDT) Received: by 10.182.107.134 with HTTP; Tue, 10 Sep 2013 15:32:52 -0700 (PDT) In-Reply-To: References: Date: Tue, 10 Sep 2013 18:32:52 -0400 Message-ID: Subject: Re: Composite Column Grouping From: "Laing, Michael" To: user@cassandra.apache.org Content-Type: multipart/alternative; boundary=001a11c2570ee4729704e60f15a6 X-Virus-Checked: Checked by ClamAV on apache.org --001a11c2570ee4729704e60f15a6 Content-Type: text/plain; charset=UTF-8 You could try this. C* doesn't do it all for you, but it will efficiently get you the right data. -ml -- put this in and run using 'cqlsh -f DROP KEYSPACE latest; CREATE KEYSPACE latest WITH replication = { 'class': 'SimpleStrategy', 'replication_factor' : 1 }; USE latest; CREATE TABLE time_series ( userid text, pkid text, colname map, PRIMARY KEY (userid, pkid) ); UPDATE time_series SET colname = colname + {'200':'Col-Name-1'} WHERE userid = 'XYZ' AND pkid = '1000'; UPDATE time_series SET colname = colname + {'201':'Col-Name-2'} WHERE userid = 'XYZ' AND pkid = '1001'; UPDATE time_series SET colname = colname + {'202':'Col-Name-3'} WHERE userid = 'XYZ' AND pkid = '1000'; UPDATE time_series SET colname = colname + {'203':'Col-Name-4'} WHERE userid = 'XYZ' AND pkid = '1000'; UPDATE time_series SET colname = colname + {'204':'Col-Name-5'} WHERE userid = 'XYZ' AND pkid = '1002'; SELECT * FROM time_series WHERE userid = 'XYZ'; -- returns: -- userid | pkid | colname ----------+------+----------------------------------------------------------------- -- XYZ | 1000 | {'200': 'Col-Name-1', '202': 'Col-Name-3', '203': 'Col-Name-4'} -- XYZ | 1001 | {'201': 'Col-Name-2'} -- XYZ | 1002 | {'204': 'Col-Name-5'} -- use an app to pop off the latest key/value from the map for each row, then sort by key desc. On Tue, Sep 10, 2013 at 9:21 AM, Ravikumar Govindarajan < ravikumar.govindarajan@gmail.com> wrote: > I have been faced with a problem of grouping composites on the second-part. > > Lets say my CF contains this > > > TimeSeriesCF > key: UserID > composite-col-name: TimeUUID:PKID > > Some sample data > > UserID = XYZ > Time:PKID > Col-Name1 = 200:1000 > Col-Name2 = 201:1001 > Col-Name3 = 202:1000 > Col-Name4 = 203:1000 > Col-Name5 = 204:1002 > > Whenever a time-series query is issued, it should return the following in > time-desc order. > > UserID = XYZ > Col-Name5 = 204:1002 > Col-Name4 = 203:1000 > Col-Name2 = 201:1001 > > Is something like this possible in Cassandra? Is there a different way to > design and achieve the same objective? > > -- > Ravi > > --001a11c2570ee4729704e60f15a6 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable
You could try this. C* doesn't do it all for you, but = it will efficiently get you the right data.

-ml

-- put this in <file> and run using 'c= qlsh -f <file>

DROP KEYSPACE latest;

CREATE K= EYSPACE latest WITH replication =3D {
=C2=A0 =C2=A0 'class= 9;: 'SimpleStrategy',=C2=A0
=C2=A0 =C2=A0 'replicatio= n_factor' : 1
};

USE latest;

CREAT= E TABLE time_series (
=C2=A0 =C2=A0 userid text,
=C2=A0= =C2=A0 pkid text,
=C2=A0 =C2=A0 colname map<text, text>,
=C2=A0 =C2=A0 PRIMARY KEY (userid, pkid)
);

UPDATE time_series SET colname =3D colname= + {'200':'Col-Name-1'} WHERE userid =3D 'XYZ' AND = pkid =3D '1000';
UPDATE time_series=C2=A0SET=C2=A0colname= =3D colname + {'201':'Col-Name-2'}=C2=A0WHERE=C2=A0userid = =3D 'XYZ'=C2=A0AND=C2=A0pkid =3D '1001';
UPDATE time_series=C2=A0SET=C2=A0colname =3D colname + {'202':= 'Col-Name-3'}=C2=A0WHERE=C2=A0userid =3D 'XYZ'=C2=A0AND=C2= =A0pkid =3D '1000';
UPDATE time_series=C2=A0SET=C2=A0coln= ame =3D colname + {'203':'Col-Name-4'}=C2=A0WHERE=C2=A0user= id =3D 'XYZ'=C2=A0AND=C2=A0pkid =3D '1000';
UPDATE time_series=C2=A0SET=C2=A0colname =3D colname + {'204':= 'Col-Name-5'}=C2=A0WHERE=C2=A0userid =3D 'XYZ'=C2=A0AND=C2= =A0pkid =3D '1002';

SELECT * FROM time_ser= ies WHERE userid =3D 'XYZ';

-- returns:
-- userid | pkid | colname
<= div>----------+------+-----------------------------------------------------= ------------
-- =C2=A0 =C2=A0XYZ | 1000 | {'200': 'Co= l-Name-1', '202': 'Col-Name-3', '203': 'Col= -Name-4'}
-- =C2=A0 =C2=A0XYZ | 1001 | =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 {'201': 'Col-Name-2'}
=
-- =C2=A0 =C2=A0XYZ | 1002 | =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 {'204': 'Col-Name-5'}
=

-- use an app to pop off the latest key/value from the map f= or each row, then sort by key desc.


On Tue, Sep 10, 2013 at 9:21 = AM, Ravikumar Govindarajan <ravikumar.govindarajan@gmail.co= m> wrote:
I have been faced with a pr= oblem of grouping composites on the second-part.

Lets sa= y my CF contains this


TimeSeriesCF
=C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0key: =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0UserID
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0composite-col-name: =C2=A0 =C2=A0TimeUUID:PKID

<= /div>
Some sample data

UserID =3D XYZ =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0
=C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0Time:PKID
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0Col-Name1 =3D 200:10= 00
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0Col-Nam= e2 =3D 201:1001
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0Col-Name3 =3D 202:1000
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0Col-Name4 =3D 203:1000
=C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0Col-Name5 =3D 204:1002

Whenever a time-series query is issued, it should retur= n the following in time-desc order.

UserID =3D XYZ=
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 Col-Name5 =3D 2= 04:1002
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 Col-Name= 4 =3D 203:1000
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 Col-Name2 =3D 201:100= 1

Is something like this possible in Cassandra? Is= there a different way to design and achieve the same objective?
=
--
Ravi
=C2=A0 =C2=A0

--001a11c2570ee4729704e60f15a6--