From: Edward Capriolo
To: user@cassandra.apache.org
Date: Thu, 13 Dec 2012 10:20:53 -0500
Subject: Re: Why Secondary indexes is so slowly by my test?

Until the change that lets secondary indexes skip the read before write is in a release and has stabilized, you should follow Ed Anuff's blog and do your indexing yourself with composites.

On Thursday, December 13, 2012, aaron morton wrote:
> The IndexClause for get_indexed_slices takes a start key. You can page the results from your secondary index query by making multiple calls with a sane count and including a start key.
>
> Cheers
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> New Zealand
> @aaronmorton
> http://www.thelastpickle.com
>
> On 13/12/2012, at 6:34 PM, Chengying Fang wrote:
>
> You are right, Dean. The slowness is due to the heavy result set returned by the query, not the index itself. According to my test, if the result has fewer than 5000 rows, it is very quick. But how do I limit the result? A row limit seems like a good choice, but then some of the rows I want may be missing, because the row order does not match the query conditions.
>
> For example: CF User{I1,C1} with an index on I1. Query conditions: I1=foo, order by C1. If I1=foo matches 10000 rows and I limit to 100, I can't get the right result for C1. Also, we cannot always set a row range that fulfills the query conditions when querying. Maybe I should redesign the CF model to fix it.
>
> ------------------ Original ------------------
> From: "Hiller, Dean"
> Date: Wed, Dec 12, 2012 10:51 PM
> To: "user@cassandra.apache.org"
> Subject: Re: Why Secondary indexes is so slowly by my test?
>
> You could always try PlayOrm's query capability on top of cassandra ;)… it works for us.
>
> Dean
>
> From: Chengying Fang
> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
> Date: Tuesday, December 11, 2012 8:22 PM
> To: user
> Subject: Re: Why Secondary indexes is so slowly by my test?
>
> Thanks to Low. We use CompositeColumn as a substitute in queries with a single inequality and definite equalities. And we will give up Cassandra because of its weak query ability and instability. Many times we have found our data in confusion in our cluster without a definite cause. For example, with only two rows in one CF, row1-columnname1-columnvalue1 and row2-columnname2-columnvalue2, it sometimes becomes row1-columnname1-columnvalue2 and row2-columnname2-columnvalue1. Notice the wrong column values.
>
> ------------------ Original ------------------
> From: "Richard Low"
> Date: Tue, Dec 11, 2012 07:44 PM
> To: "user"
> Subject: Re: Why Secondary indexes is so slowly by my test?
>
> Hi,
>
> Secondary index lookups are more complicated than normal queries so will be slower. Items have to first be queried in the index, then retrieved from their actual location. Also, inserting into indexed CFs will be slower (but will get substantially faster in 1.2 due
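
The case raised in the thread (I1=foo, ordered by C1, with a limit of 100) is the kind of query a hand-built composite index can answer directly. What follows is a minimal in-memory Python sketch of the idea, not Cassandra code. The sorted list of (index value, sort value, row key) tuples stands in for composite column names inside one index row, and `CompositeIndex`, `add`, and `query` are hypothetical names used only for illustration.

```python
import bisect


class CompositeIndex:
    """In-memory stand-in for a hand-maintained index row whose column
    names are composites of (indexed value, sort value, row key)."""

    def __init__(self):
        self._entries = []  # kept sorted, like columns within a row

    def add(self, index_value, sort_value, row_key):
        # Insert while preserving sort order.
        bisect.insort(self._entries, (index_value, sort_value, row_key))

    def query(self, index_value, limit):
        """Return up to `limit` row keys with the given index value,
        ordered by sort value."""
        # A 1-tuple sorts before every longer tuple sharing its prefix,
        # so this finds the first entry for index_value.
        start = bisect.bisect_left(self._entries, (index_value,))
        keys = []
        for value, _sort, row_key in self._entries[start:]:
            if value != index_value or len(keys) >= limit:
                break
            keys.append(row_key)
        return keys


index = CompositeIndex()
index.add("foo", 30, "row-c")
index.add("foo", 10, "row-a")
index.add("bar", 5, "row-x")
index.add("foo", 20, "row-b")

first_two = index.query("foo", limit=2)
print(first_two)
```

Because the entries are already sorted by the sort value within each indexed value, the limit cuts off the tail of the ordered result instead of an arbitrary subset of rows.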

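
The paging approach described in the thread (multiple calls with a sane count and a start key) can be sketched as a loop. This is a sketch under stated assumptions, not the Thrift client API: `fetch_page(start_key, count)` is a hypothetical callable standing in for one get_indexed_slices call, assumed to return (key, row) pairs in a stable order beginning at the start key inclusive. Under that assumption every page after the first repeats the previous page's last row, so the loop drops it.

```python
def page_index_query(fetch_page, count=100):
    """Yield every (key, row) by calling fetch_page repeatedly.

    fetch_page(start_key, count) is a hypothetical stand-in for one
    index query; it is assumed to return at most `count` (key, row)
    pairs starting at start_key inclusive (None means from the start).
    """
    if count < 2:
        raise ValueError("count must be at least 2 to make progress")
    start_key = None
    while True:
        page = fetch_page(start_key, count)
        # After the first call the first row repeats the previous
        # page's last row, so skip it.
        rows = page if start_key is None else page[1:]
        yield from rows
        if len(page) < count:
            return  # short page: nothing left to fetch
        start_key = page[-1][0]


# Toy backend for illustration: ten matching rows in a fixed order.
DATA = [("key%02d" % i, {"I1": "foo"}) for i in range(10)]


def fake_fetch(start_key, count):
    keys = [k for k, _ in DATA]
    begin = 0 if start_key is None else keys.index(start_key)
    return DATA[begin:begin + count]


all_keys = [key for key, _ in page_index_query(fake_fetch, count=4)]
print(all_keys)
```

Paging bounds the size of each response, but it does not change the order in which rows come back, so it does not by itself address the "order by C1" concern raised earlier in the thread.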