From: Ertio Lew
To: user@cassandra.apache.org
Date: Tue, 27 Mar 2012 15:01:23 +0530
Subject: Re: Schema advice/help
References: <4F7175DC.5040804@gmail.com>

@R. Verlangen:
You are suggesting keeping a single row for all activities, reading all the
columns from that row, and then filtering, right?

If done that way (instead of keeping it in 5 rows), I would need to retrieve
100-200 columns from a single row rather than just 50 columns if I keep 5
rows. Which of these two would be better: more columns from a single row, or
fewer columns from multiple rows?

On Tue, Mar 27, 2012 at 2:27 PM, R. Verlangen wrote:

> You can just get a slice range with "userId:" as start and no end.
>
>
> 2012/3/27 Maciej Miklas
>
>> multiget would require Order Preserving Partitioner, and this can lead to
>> unbalanced ring and hot spots.
>>
>> Maybe you can use secondary index on "itemtype" - it must have small
>> cardinality:
>> http://pkghosh.wordpress.com/2011/03/02/cassandra-secondary-index-patterns/
>>
>>
>> On Tue, Mar 27, 2012 at 10:10 AM, Guy Incognito wrote:
>>
>>> without the ability to do disjoint column slices, i would probably use 5
>>> different rows.
>>>
>>> userId:itemType -> activityId
>>>
>>> then it's a multiget slice of 10 items from each of your 5 rows.
>>>
>>>
>>> On 26/03/2012 22:16, Ertio Lew wrote:
>>>
>>>> I need to store activities by each user, on 5 item types. I always
>>>> want to read the last 10 activities on each item type by a user (i.e.,
>>>> total activities to read at a time = 50).
>>>>
>>>> I want to store these activities in a single row for each user so that
>>>> they can be retrieved in a single row query, since I want to read all
>>>> of the last 10 activities on each item. I am thinking of creating
>>>> composite names by appending "itemtype" : "activityId" (activityId is
>>>> just a timestamp value), but then I don't see how to read the last 10
>>>> activities from all itemtypes.
>>>>
>>>> Any ideas about a schema to do this in a better way?
>>>>
>
> --
> With kind regards,
>
> Robin Verlangen
> www.robinverlangen.nl
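To make the two layouts discussed in this thread concrete, here is a minimal in-memory sketch in Python. It is not Cassandra client code and calls no Cassandra API: sorted Python lists stand in for sorted columns, and the item type names ("t0" to "t4"), the user id "user42", the figure of 40 activities per item type, and all function names are assumptions made up for illustration. It only counts how many columns each approach touches; it says nothing about real read latency.

```python
from bisect import bisect_left, bisect_right

ITEM_TYPES = ["t0", "t1", "t2", "t3", "t4"]  # hypothetical item type names
LAST_N = 10


def build_single_row(activities):
    """One wide row per user: columns sorted by composite name (item_type, activity_id)."""
    return sorted(activities)


def last_n_single_row(row, item_type, n=LAST_N):
    """Emulate a reversed column slice limited to one item_type prefix."""
    lo = bisect_left(row, (item_type, float("-inf")))
    hi = bisect_right(row, (item_type, float("inf")))
    return row[max(lo, hi - n):hi][::-1]  # newest first


def build_multi_row(user_id, activities):
    """Five narrow rows per user, keyed 'userId:itemType', columns sorted by activity_id."""
    rows = {f"{user_id}:{t}": [] for t in ITEM_TYPES}
    for item_type, activity_id in activities:
        rows[f"{user_id}:{item_type}"].append(activity_id)
    for cols in rows.values():
        cols.sort()
    return rows


def last_n_multi_row(rows, user_id, n=LAST_N):
    """Emulate a multiget slice: the same reversed slice of n columns on every row key."""
    prefix = f"{user_id}:"
    return {k: cols[-n:][::-1] for k, cols in rows.items() if k.startswith(prefix)}


# 40 activities per item type; activity_id stands in for a timestamp.
activities = [(t, ts) for t in ITEM_TYPES for ts in range(1000, 1040)]

single = build_single_row(activities)
multi = build_multi_row("user42", activities)

# Single row: one prefix-bounded slice per item type (5 slices in total).
per_type_single = {
    t: [aid for _, aid in last_n_single_row(single, t)] for t in ITEM_TYPES
}
# Five rows: one multiget-style read across the 5 row keys.
per_type_multi = {
    k.split(":", 1)[1]: v for k, v in last_n_multi_row(multi, "user42").items()
}

columns_if_whole_row_is_read = len(single)  # read everything, filter client side
columns_if_sliced = sum(len(v) for v in per_type_multi.values())
```

Under these assumptions both layouts return the same 10 newest activities per item type. The difference is in how they get there: reading the whole single row and filtering touches every stored column, while the five-row layout (or five prefix-bounded slices on the single row) touches only the 50 columns actually wanted.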