From: Filippo Diotalevi
To: user@cassandra.apache.org
Date: Tue, 22 May 2012 13:03:58 +0100
Subject: Re: RE Ordering counters in Cassandra

Thanks for all the answers, they definitely helped.

Just out of curiosity, is there any underlying architectural reason why it's not possible to order a row based on its counter values? Or is it something that might be on the roadmap in the future?

-- 
Filippo Diotalevi

On Tuesday, 22 May 2012 at 08:48, Romain HARDOUIN wrote:

> I mean iterate over each column -- more precisely: *bunches of columns* using slices -- and write new columns in the inverted index.
> Tamar's data model is made for real-time analysis. It may be overdesigned for a daily ranking.
> I agree with Samal, you should split your data across the token space. Only the feeding of CF Ranking would be affected, not the "top N" queries.
>
> Filippo Diotalevi <filippo@ntoklo.com> wrote on 21/05/2012 19:05:28:
>
> > Hi Romain,
> > thanks for your suggestion.
> >
> > When you say "build every day a ranking in a dedicated CF by
> > iterating over events:" do you mean
> > - load all the columns for the specified row key
> > - iterate over each column, and write a new column in the inverted index
> > ?
> >
> > That's my current approach, but since I have many of these wide rows
> > (1 per day), the process is extremely slow as it involves moving an
> > entire row from Cassandra to client, inverting every column, and
> > sending the data back to create the inverted index.
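
For readers of the archive, here is a minimal sketch of the approach Romain describes: walk one wide counter row in slices (bunches of columns) rather than loading it whole, and write each column into an inverted index ordered by counter value. It is only an illustration under stated assumptions. The row is a plain in-memory dict standing in for a Cassandra row, column names are assumed to be non-empty strings, and `get_slice`, `iter_slices`, `build_ranking`, and `top_n` are hypothetical helper names, not calls from any Cassandra client.

```python
from bisect import bisect_right
from typing import Dict, Iterator, List, Tuple

# Hypothetical stand-in for one wide counter row: column name -> counter value.
Row = Dict[str, int]


def get_slice(row: Row, start: str, count: int) -> List[Tuple[str, int]]:
    """Return up to `count` columns whose names sort strictly after `start`."""
    names = sorted(row)  # re-sorted on every call; acceptable for a sketch
    first = bisect_right(names, start)
    return [(name, row[name]) for name in names[first:first + count]]


def iter_slices(row: Row, page_size: int) -> Iterator[List[Tuple[str, int]]]:
    """Yield the row one bunch of columns at a time."""
    start = ""  # assumes no column has an empty name
    while True:
        page = get_slice(row, start, page_size)
        if not page:
            return
        yield page
        start = page[-1][0]  # resume after the last column already seen


def build_ranking(row: Row, page_size: int = 2) -> List[Tuple[int, str]]:
    """Invert the row: (counter value, column name), highest value first."""
    ranking: List[Tuple[int, str]] = []
    for page in iter_slices(row, page_size):
        ranking.extend((value, name) for name, value in page)
    ranking.sort(reverse=True)
    return ranking


def top_n(ranking: List[Tuple[int, str]], n: int) -> List[Tuple[int, str]]:
    """A "top N" query only reads the head of the inverted index."""
    return ranking[:n]


events = {"a": 3, "b": 10, "c": 7, "d": 1, "e": 7}
daily_ranking = build_ranking(events, page_size=2)
print(top_n(daily_ranking, 2))
```

The slicing keeps each round trip small, but every column still travels to the client and back, which is the cost Filippo reports for rows this wide.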