From: Cyril Auburtin
Date: Wed, 16 May 2012 10:23:17 +0200
Subject: Re: primary keys query
To: user@cassandra.apache.org

Thanks, I was looking at http://code.google.com/p/javageomodel/ too.

2012/5/14 aaron morton <aaron@thelastpickle.com>

> So it seems it's not a good idea to use Cassandra like that?
>
> Right. It's basically a table scan.
>
> Here is some background on the approach SimpleGeo took to using
> Cassandra:
> http://www.readwriteweb.com/cloud/2011/02/video-simplegeo-cassandra.php
>
> Also, PostGIS for Postgres seems popular: http://postgis.refractions.net/
>
> Hope that helps.
>
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 12/05/2012, at 4:23 AM, cyril auburtin wrote:
>
> I was thinking of a CF with many, many rows with id, type, latitude and
> longitude (indexed), and doing geolocation queries such as:
>
>    type=all and lat < 43 and lat > 42.9 and lon < 7.3 and lon > 7.2
>
> where all rows have type=all
> (at least to try how Cassandra deals with that).
> So it seems it's not a good idea to use Cassandra like that?
>
> There's also the possibility of maintaining, in parallel, another CF with
> latitude in rows, which will be sorted, so an indexed query can give us
> the right latitude range, and then just querying with longitude < and >.
>
> What do you think of that?
>
> Thanks
>
> 2012/5/11 Dave Brosius <dbrosius@mebigfatguy.com>
>
>> Inequalities on secondary indices are always done in memory, so without
>> at least one EQ on another secondary index you will be loading every row
>> in the database, which with a massive database isn't a good idea. So by
>> requiring at least one EQ on an index, you hopefully limit the set of
>> rows that need to be read into memory to a manageable size. Obviously
>> you can still get into trouble with that as well.
>>
>> On 05/11/2012 09:39 AM, cyril auburtin wrote:
>>
>>> Sorry for asking, but why is it necessary to always have at least one
>>> EQ comparison?
>>>
>>> [default@Keyspace1] get test where birth_year>1985;
>>>    No indexed columns present in index clause with operator EQ
>>>
>>> This forces me to add a dummy indexed column just to run the query:
>>>
>>> [default@Keyspace1] get test where tag=sea and birth_year>1985;
>>> -------------------
>>> RowKey: sam
>>> => (column=birth_year, value=1988, timestamp=1336742346059000)
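As a rough illustration of Dave's point in the thread, here is a small Python sketch (not Cassandra code) of why the EQ clause matters: the index is only used to find the rows matching the EQ value, and the inequality is then checked in memory against those candidates. The names ROWS, build_index and query are made up for this example, and the rows other than "sam" are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample rows; only "sam" (tag=sea, birth_year=1988) appears in the thread.
ROWS = {
    "sam": {"tag": "sea", "birth_year": 1988},
    "ann": {"tag": "sea", "birth_year": 1979},
    "bob": {"tag": "mountain", "birth_year": 1990},
}


def build_index(rows, column):
    """Map each value of `column` to the set of row keys that hold it."""
    index = defaultdict(set)
    for key, columns in rows.items():
        if column in columns:
            index[columns[column]].add(key)
    return index


def query(rows, eq_index, eq_value, gt_column, gt_bound):
    """Resolve the EQ clause through the index, then filter the inequality in memory.

    Returns the matching row keys and the number of rows that had to be read.
    """
    candidates = eq_index.get(eq_value, set())  # index lookup for the EQ clause
    rows_read = len(candidates)                 # every candidate is loaded into memory
    matches = sorted(
        key for key in candidates
        if gt_column in rows[key] and rows[key][gt_column] > gt_bound
    )
    return matches, rows_read


tag_index = build_index(ROWS, "tag")
matches, rows_read = query(ROWS, tag_index, "sea", "birth_year", 1985)
print(matches, rows_read)
```

With no EQ clause there would be no candidate set to start from, so every row would have to be read. With a dummy column such as type=all on every row, the candidate set is again the whole table, which is the table scan Aaron describes.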

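The alternative raised in the thread (a second CF kept sorted by latitude, sliced by range, with longitude filtered afterwards) could look roughly like the Python sketch below. It is only a model of the idea, assuming the latitudes are stored in sorted order so a range can be sliced cheaply. POINTS, in_box and the sample coordinates are hypothetical.

```python
import bisect

# Hypothetical points as (latitude, longitude, row_key), kept sorted by latitude
# to stand in for a structure that stores entries in latitude order.
POINTS = sorted([
    (42.95, 7.25, "a"),
    (42.95, 8.10, "b"),
    (43.30, 7.25, "c"),
    (42.50, 7.22, "d"),
])


def in_box(points, lat_min, lat_max, lon_min, lon_max):
    """Slice the latitude band by binary search, then filter longitude in memory."""
    # Sentinel tuples make both latitude bounds strict (lat_min < lat < lat_max).
    lo = bisect.bisect_right(points, (lat_min, float("inf"), ""))
    hi = bisect.bisect_left(points, (lat_max, float("-inf"), ""))
    band = points[lo:hi]
    return [key for _lat, lon, key in band if lon_min < lon < lon_max]


# Same bounds as the query in the thread: 42.9 < lat < 43 and 7.2 < lon < 7.3
print(in_box(POINTS, 42.9, 43, 7.2, 7.3))
```

The latitude slice only narrows the search to a band running all the way around the globe, so the longitude filter may still have to read many rows when the data set is dense.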