incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: Why Secondary indexes is so slowly by my test?
Date Thu, 13 Dec 2012 07:07:41 GMT
The IndexClause for the get_indexed_slices takes a start key. You can page the results from
your secondary index query by making multiple calls with a sane count and including a start
key. 
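
The paging loop described above can be sketched roughly as follows. This is only an illustration, not client code: `make_fake_index_query` is a hypothetical in-memory stand-in for a real get_indexed_slices call, and the sketch assumes the start key is inclusive, so every follow-up page begins with the last row of the previous page and that row is dropped.

```python
from typing import Callable, Dict, Iterator, List, Tuple

Row = Tuple[str, Dict[str, str]]  # (row key, columns)
IndexQuery = Callable[[str, str, int], List[Row]]


def make_fake_index_query(rows: List[Row]) -> IndexQuery:
    """Hypothetical stand-in for get_indexed_slices over in-memory rows.

    Rows are returned in the order of the given list (a real cluster returns
    them in partitioner order). Assumption: start_key is inclusive.
    """
    def query(index_value: str, start_key: str, count: int) -> List[Row]:
        matching = [r for r in rows if r[1].get("tk") == index_value]
        if start_key:
            keys = [key for key, _ in matching]
            matching = matching[keys.index(start_key):]
        return matching[:count]
    return query


def page_indexed_rows(query: IndexQuery, index_value: str,
                      page_size: int = 100) -> Iterator[Row]:
    """Yield all rows matching index_value, fetching page_size rows per call."""
    if page_size < 2:
        # With an inclusive start key, a page of 1 would never advance.
        raise ValueError("page_size must be at least 2")
    start_key = ""  # an empty start key means "from the beginning"
    while True:
        page = query(index_value, start_key, page_size)
        # On follow-up pages the first row repeats the previous page's last row.
        fresh = page[1:] if start_key else page
        for row in fresh:
            yield row
        if len(page) < page_size:
            return  # a short page means the index has been exhausted
        start_key = page[-1][0]
```

Each call returns at most `page_size` rows, so no single request has to carry the whole result set; the cost is one repeated row per follow-up page.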

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 13/12/2012, at 6:34 PM, Chengying Fang <cyfang@ngnsoft.com> wrote:

> You are right, Dean. The slowness comes from the large result set returned by the query,
> not from the index itself. In my tests, when the result has fewer than 5000 rows, the query
> is very quick. But how should I limit the result? A row limit seems like a good choice, but
> then I may miss some of the rows I want, because rows are not returned in the order my
> query requires.
> For example: CF User{I1,C1} with an index on I1, and the query conditions I1=foo, order by
> C1. If I1=foo matches 10000 rows and I limit the result to 100, I can't get the correct
> result ordered by C1. Also, we cannot always set a row range that satisfies the query
> conditions when querying. Maybe I should redesign the CF model to fix this.
>  
> ------------------ Original ------------------
> From:  "Hiller, Dean"<Dean.Hiller@nrel.gov>;
> Date:  Wed, Dec 12, 2012 10:51 PM
> To:  "user@cassandra.apache.org"<user@cassandra.apache.org>;
> Subject:  Re: Why Secondary indexes is so slowly by my test?
>  
> You could always try PlayOrm's query capability on top of Cassandra ;) ... it works for us.
> 
> Dean
> 
> From: Chengying Fang <cyfang@ngnsoft.com>
> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
> Date: Tuesday, December 11, 2012 8:22 PM
> To: user <user@cassandra.apache.org>
> Subject: Re: Why Secondary indexes is so slowly by my test?
> 
> Thanks, Low. We use CompositeColumn as a substitute for queries with a single inequality
> and definite equalities. We will give up on Cassandra because of its weak query capability
> and instability. Many times we have found the data in our cluster mixed up without any
> definite cause. For example, with only two rows in one CF,
> row1-columnname1-columnvalue1 and row2-columnname2-columnvalue2, it sometimes becomes
> row1-columnname1-columnvalue2 and row2-columnname2-columnvalue1. Notice the wrong
> column values.
> 
> 
> ------------------ Original ------------------
> From:  "Richard Low"<rlow@acunu.com>;
> Date:  Tue, Dec 11, 2012 07:44 PM
> To:  "user"<user@cassandra.apache.org>;
> Subject:  Re: Why Secondary indexes is so slowly by my test?
> 
> Hi,
> 
> Secondary index lookups are more complicated than normal queries so will be slower. Items
> have to first be queried in the index, then retrieved from their actual location. Also,
> inserting into indexed CFs will be slower (but will get substantially faster in 1.2 due to
> CASSANDRA-2897).
> 
> If you need to retrieve large amounts of data with your query, you would be better off
> changing your data model to not use secondary indexes.
> 
> Richard.
> 
> 
> On 7 December 2012 03:08, Chengying Fang <cyfang@ngnsoft.com> wrote:
> Hi guys,
> 
> I found secondary indexes too slow in our production environment (Amazon large instance)
> with Cassandra, so I ran the test again as described here. The result is the same as in
> production. What's wrong with Cassandra, or with me?
> Now my test:
> a newly installed ubuntu-12.04 LTS, apache-cassandra-1.1.6, default configuration, just
> one keyspace (test) and one CF (TestIndex):
> 
>     CREATE COLUMN FAMILY TestIndex
>     WITH comparator = UTF8Type
>     AND key_validation_class = UTF8Type
>     AND default_validation_class = UTF8Type
>     AND column_metadata = [
>       {column_name: tk, validation_class: UTF8Type, index_type: KEYS},
>       {column_name: from, validation_class: UTF8Type},
>       {column_name: to, validation_class: UTF8Type},
>       {column_name: tm, validation_class: UTF8Type}
>     ];
> 
> 'tk' has just three values: 'A' (1000 rows), 'B' (1000 rows), and 'X' (row count increased
> during the test).
> The test queries from CQL:
> 1. without index: select count(*) from TestIndex limit 1000000;
> 2. with index: select count(*) from TestIndex where tk='X' limit 1000000;
> With 60000 'X' rows inserted, the times were 1s and 12s.
> With 130000 'X' rows, the times were 2.3s and 33s.
> With 250000 'X' rows, the times were 3.8s and 53s.
> 
> Based on this, what will the result be when 'X' reaches a billion rows? Can secondary
> indexes be used in production? I hope the mistake is in how I did this test. Can anyone
> give some tips about it?
> Thanks in advance.
> fancy
> 
> 
> 
> --
> Richard Low
> Acunu | http://www.acunu.com | @acunu
> 

