Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@cassandra.apache.org
Received-SPF: pass (athena.apache.org: domain of edlinuxguru@gmail.com
 designates 209.85.223.176 as permitted sender)
MIME-Version: 1.0
In-Reply-To: <FF63202A-C151-42F6-BDA6-8D93C9C87F81@thelastpickle.com>
References: <tencent_19ABD4C958036AC22265A110@qq.com>
	<FF63202A-C151-42F6-BDA6-8D93C9C87F81@thelastpickle.com>
Date: Thu, 13 Dec 2012 10:20:53 -0500
Message-ID: 
 <CAENxBwx0BBLm-Kff2A39RZhrtE77JDo-5zpmq+-jGZb7VF42Ww@mail.gmail.com>
Subject: Re: Why Secondary indexes is so slowly by my test?
From: Edward Capriolo <edlinuxguru@gmail.com>
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Content-Type: multipart/alternative; boundary=14dae9340f8bfa460804d0bd758e

--14dae9340f8bfa460804d0bd758e
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: quoted-printable

Until the change that stops secondary indexes from doing a read before
write is in a release and has stabilized, you should follow Ed Anuff's blog
and do your indexing yourself with composites.
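
For what it's worth, here is a rough sketch of the "index it yourself with
composites" idea, in plain Python with a sorted list standing in for one wide
index row. ManualIndex, add and lookup are made-up names for illustration,
not any client's API, and nothing here talks to a real cluster.

```python
import bisect


class ManualIndex:
    """Toy stand-in for one wide index row whose column names are
    composites of (indexed_value, row_key), kept in sorted order."""

    def __init__(self):
        self._columns = []  # sorted list of (indexed_value, row_key)

    def add(self, indexed_value, row_key):
        """Write the index entry alongside the data row; duplicates are ignored."""
        entry = (indexed_value, row_key)
        pos = bisect.bisect_left(self._columns, entry)
        if pos == len(self._columns) or self._columns[pos] != entry:
            self._columns.insert(pos, entry)

    def lookup(self, indexed_value, count=100):
        """Return up to `count` row keys stored under indexed_value."""
        # (indexed_value,) sorts before every (indexed_value, row_key) pair,
        # so this finds the first column for that value.
        pos = bisect.bisect_left(self._columns, (indexed_value,))
        keys = []
        while (pos < len(self._columns)
               and self._columns[pos][0] == indexed_value
               and len(keys) < count):
            keys.append(self._columns[pos][1])
            pos += 1
        return keys


index = ManualIndex()
index.add("foo", "row2")
index.add("bar", "row9")
index.add("foo", "row1")
print(index.lookup("foo"))
```

The point is only that the lookup becomes a contiguous slice of sorted
columns, so the application controls both the order and the size of what
comes back.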

On Thursday, December 13, 2012, aaron morton <aaron@thelastpickle.com>
wrote:
> The IndexClause for get_indexed_slices takes a start key. You can
> page the results from your secondary index query by making multiple calls
> with a sane count and including a start key.
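
A sketch of that paging loop, in case it helps. fake_get_indexed_slices is a
hypothetical in-memory stand-in, not the Thrift call itself: it assumes rows
come back in key order and that the start key is inclusive, which is why
every page after the first drops its first row. A real cluster may return
rows in a different order, so treat this as the shape of the loop only.

```python
def fake_get_indexed_slices(rows, value, start_key, count):
    """Hypothetical stand-in for an indexed query: return up to `count`
    (key, value) pairs that match `value`, starting at start_key
    (inclusive), in key order."""
    matches = [(k, v) for k, v in sorted(rows.items())
               if v == value and k >= start_key]
    return matches[:count]


def page_index_query(rows, value, page_size):
    """Yield every matching row key, fetching page_size rows per call."""
    if page_size < 2:
        # With an inclusive start key, a page of 1 would never advance.
        raise ValueError("page_size must be at least 2")
    start_key = ""
    first_page = True
    while True:
        page = fake_get_indexed_slices(rows, value, start_key, page_size)
        # Later pages begin with the last key already seen, so skip it.
        fresh = page if first_page else page[1:]
        for key, _ in fresh:
            yield key
        if len(page) < page_size:
            break  # a short page means there is nothing left
        start_key = page[-1][0]
        first_page = False


rows = {"row%03d" % i: ("foo" if i % 2 == 0 else "bar") for i in range(10)}
print(list(page_index_query(rows, "foo", 3)))
```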
> Cheers
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> New Zealand
> @aaronmorton
> http://www.thelastpickle.com
> On 13/12/2012, at 6:34 PM, Chengying Fang <cyfang@ngnsoft.com> wrote:
>
> You are right, Dean. It's due to the heavy result returned by the query,
> not the index itself. According to my test, if the result has fewer than
> 5000 rows, it's very quick. But how do I limit the result? A row limit
> seems like a good choice. But if I do that, some rows I want may be missed,
> because the row order does not follow the query conditions.
> For example: CF User{I1,C1} with an index on I1. Query conditions: I1=foo,
> order by C1. If I1=foo returns 10000 rows and the limit is 100, I can't get
> the right result for C1. Also, we cannot always set a row range that
> fulfils the query conditions when doing the query. Maybe I should redesign
> the CF model to fix it.
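
To make the User{I1,C1} example concrete, here is a small simulation. The
data, the key-order assumption for index results, and both function names
are invented for illustration. It compares "limit 100, then sort by C1"
against slicing a hand-built index whose column names are (I1, C1, row key)
composites.

```python
# 10,000 hypothetical User rows, all with I1 = "foo". C1 is a shuffled
# integer: 7919 is coprime with 10000, so every value 0..9999 appears once.
rows = {"user%05d" % i: {"I1": "foo", "C1": (i * 7919) % 10000}
        for i in range(10000)}


def limit_then_sort(rows, value, limit):
    """Row limit applied to the index query: take the first `limit` matches
    in key order (an assumption), then sort those by C1."""
    matches = [k for k in sorted(rows) if rows[k]["I1"] == value][:limit]
    return sorted(rows[k]["C1"] for k in matches)


def composite_slice(rows, value, limit):
    """Hand-built index with (I1, C1, row key) composite column names: the
    columns are already ordered by C1 within each I1 value, so the first
    `limit` columns are the ones the query wants."""
    index = sorted((r["I1"], r["C1"], k) for k, r in rows.items())
    return [c1 for i1, c1, _ in index if i1 == value][:limit]


wanted = list(range(100))  # the 100 smallest C1 values
print(limit_then_sort(rows, "foo", 100) == wanted)
print(composite_slice(rows, "foo", 100) == wanted)
```

The first approach sorts only whichever 100 rows the index happened to
return first, which is the missing-rows problem described above; the
composite layout puts the ordering into the column names instead.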
>
> ------------------ Original ------------------
> From:  "Hiller, Dean"<Dean.Hiller@nrel.gov>;
> Date:  Wed, Dec 12, 2012 10:51 PM
> To:  "user@cassandra.apache.org"<user@cassandra.apache.org>;
> Subject:  Re: Why Secondary indexes is so slowly by my test?
>
> You could always try PlayOrm's query capability on top of cassandra
> ;)... it works for us.
>
> Dean
>
> From: Chengying Fang <cyfang@ngnsoft.com>
> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
> Date: Tuesday, December 11, 2012 8:22 PM
> To: user <user@cassandra.apache.org>
> Subject: Re: Why Secondary indexes is so slowly by my test?
>
> Thanks to Low. We use CompositeColumn as a substitute in queries with a
> single inequality and definite equalities. And we will give up cassandra
> because of its weak query ability and instability. Many times we found our
> data mixed up in our cluster with no definite cause. For example, with only
> two rows in one CF,
> row1-columnname1-columnvalue1,row2-columnname2-columnvalue2, sometimes
> it becomes
> row1-columnname1-columnvalue2,row2-columnname2-columnvalue1. Notice the
> wrong column value.
>
>
> ------------------ Original ------------------
> From:  "Richard Low"<rlow@acunu.com>;
> Date:  Tue, Dec 11, 2012 07:44 PM
> To:  "user"<user@cassandra.apache.org>;
> Subject:  Re: Why Secondary indexes is so slowly by my test?
>
> Hi,
>
> Secondary index lookups are more complicated than normal queries so will
> be slower. Items have to first be queried in the index, then retrieved from
> their actual location. Also, inserting into indexed CFs will be slower (but
> will get substantially faster in 1.2 due

--14dae9340f8bfa460804d0bd758e--