cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: Secondary indexes performance
Date Wed, 22 Jun 2011 22:36:14 GMT
> it will probably be better to denormalize and store
> some precomputed data

Yes, if you know which queries you need to serve, it is better to support them directly
in the data model.
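
As a rough illustration of supporting a known query directly in the data model, the sketch below keeps a second, precomputed lookup up to date on every write, so the read becomes a single key lookup instead of an index scan. Plain dicts stand in for column families; `users`, `users_by_state`, `save_user` and `users_in_state` are hypothetical names for illustration only, not part of pycassa or of the schema discussed in this thread.

```python
from collections import defaultdict

# Hypothetical in-memory stand-ins for two column families.
users = {}                          # row key -> columns
users_by_state = defaultdict(dict)  # state -> {row key: precomputed summary}


def save_user(key, columns):
    """Write the row and keep the denormalized lookup in step with it."""
    old = users.get(key)
    if old is not None and old["state"] != columns["state"]:
        # The looked-up value changed, so drop the stale entry first.
        users_by_state[old["state"]].pop(key, None)
    users[key] = dict(columns)
    users_by_state[columns["state"]][key] = {"name": columns["name"]}


def users_in_state(state):
    """Serve the known query with one lookup and no filtering."""
    return dict(users_by_state.get(state, {}))
```

The trade-off is an extra write on every update and the need to keep both copies consistent.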

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 22 Jun 2011, at 23:52, Wojciech Pietrzok wrote:

> OK, got some results (below).
> 2 nodes, one on localhost, second on LAN, reading with
> ConsistencyLevel.ONE, buffer_size=512 rows (that's how many rows
> pycassa will get on one connection, then it will use the last row_id as
> the start row for the next query)
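
The paging described above can be sketched roughly as follows. This is not pycassa's code: `get_range` is a hypothetical stand-in that reads from a sorted in-memory list, and the sketch assumes the start key of each query is inclusive, so every page after the first repeats the previous last row and that duplicate is skipped. A real cluster orders rows according to its partitioner rather than by sorted key.

```python
from bisect import bisect_left


def get_range(sorted_keys, start, count):
    """Hypothetical stand-in for one range query: up to `count` keys >= start."""
    i = bisect_left(sorted_keys, start)
    return sorted_keys[i:i + count]


def iter_rows(sorted_keys, buffer_size, limit):
    """Yield up to `limit` keys, fetching `buffer_size` keys per query."""
    if buffer_size < 2:
        # With an inclusive start key, a page of 1 would never advance.
        raise ValueError("buffer_size must be at least 2")
    start = ""
    yielded = 0
    first_page = True
    while yielded < limit:
        page = get_range(sorted_keys, start, buffer_size)
        # Assumed inclusive start: later pages repeat the previous last key.
        rows = page if first_page else page[1:]
        for key in rows:
            yield key
            yielded += 1
            if yielded == limit:
                return
        if len(page) < buffer_size:
            return  # a short page means the range is exhausted
        start = page[-1]
        first_page = False
```

With buffer_size=512 and a limit of 1024 rows this takes three queries, since the second and third pages each give up one slot to the repeated start row.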
> 
> Queries types:
> 1) get_range - just added limit of 1024 rows
> 2) get_indexed_slices ASCII - one term: on indexed column with ASCII type
> 3) get_indexed_slices INT - one term: on indexed column with INT type
> 4) get_indexed_slices ASCII  + GTE, LTE on indexed INT - three terms:
> on indexed column with INT type + LTE, GTE on indexed column with INT
> type
> 5) get_indexed_slices 2 terms, ASCII - two terms, both columns
> indexed, with ASCII type
> 6) get_indexed_slices ASCII + GTE, LTE on non indexed INT - like 4)
> but LTE, GTE are on non-indexed column
> 
> 3 runs for each set of queries; times were better on successive runs.
> Times are in seconds.
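
For anyone wanting to repeat this kind of measurement, a minimal timing harness could look like the sketch below. It is an assumption about how such numbers might be collected, not the script used for the results in this message; `timed` and `run_suite` are hypothetical names, and the label in square brackets is treated as an opaque tag.

```python
import time


def timed(label, name, query):
    """Run `query` once and return (result, PERF line with elapsed seconds)."""
    start = time.perf_counter()
    result = query()
    elapsed = time.perf_counter() - start
    return result, "PERF: [%s] %s: %.2f[s]" % (label, name, elapsed)


def run_suite(queries, runs=3):
    """Time every (label, name, callable) entry `runs` times, in order."""
    lines = []
    for run in range(1, runs + 1):
        lines.append("# Run %d" % run)
        for label, name, query in queries:
            _, line = timed(label, name, query)
            lines.append(line)
    return lines
```

Each callable should consume the whole result (for example by wrapping a generator in list()), otherwise only the first fetch is timed.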
> 
> 
> But if you say that 1024 rows is a reasonably big slice (not to mention
> over 10k rows), it will probably be better to denormalize and store
> some precomputed data.
> 
> 
> Results:
> 
> # Run 1
> PERF: [a] get_range: 0.58[s]
> PERF: [a] get_indexed_slices ASCII: 3.96[s]
> PERF: [a] get_indexed_slices INT: 1.82[s]
> PERF: [a] get_indexed_slices INT + GTE, LTE on indexed INT: 1.31[s] # 314 returned
> PERF: [cr] get_indexed_slices ASCII: 1.13[s]
> PERF: [cr] get_indexed_slices 2 terms, ASCII: 8.69[s]
> 
> # Run 2, same queries
> PERF: [a] get_range: 0.33[s]
> PERF: [a] get_indexed_slices ASCII: 0.36[s]
> PERF: [a] get_indexed_slices INT: 5.39[s]
> PERF: [a] get_indexed_slices INT + GTE, LTE on indexed INT : 5.42[s] # 314 returned
> PERF: [cr] get_indexed_slices ASCII: 0.55[s]
> PERF: [cr] get_indexed_slices 2 terms, ASCII: 3.57[s]
> 
> # Run 3, same queries
> PERF: [a] get_range: 0.18[s]
> PERF: [a] get_indexed_slices ASCII: 0.39[s]
> PERF: [a] get_indexed_slices INT: 0.83[s]
> PERF: [a] get_indexed_slices INT + GTE, LTE on indexed INT : 0.85[s] # 314 returned
> PERF: [cr] get_indexed_slices ASCII: 0.39[s]
> PERF: [cr] get_indexed_slices 2 terms, ASCII: 3.36[s]
> 
> # changed some terms, so 1024 rows are always returned
> # Run 1
> PERF: [a] get_range: 0.31[s]
> PERF: [a] get_indexed_slices ASCII: 3.14[s]
> PERF: [a] get_indexed_slices INT: 0.70[s]
> PERF: [a] get_indexed_slices INT + GTE, LTE on indexed INT : 4.72[s]
> PERF: [cr] get_indexed_slices ASCII: 0.73[s]
> PERF: [cr] get_indexed_slices 2 terms, ASCII: 0.85[s]
> PERF: [cr] get_indexed_slices ASCII + GTE, LTE on non indexed INT : 2.17[s]
> 
> # Run 2, same queries
> PERF: [a] get_range: 0.20[s]
> PERF: [a] get_indexed_slices ASCII: 0.60[s]
> PERF: [a] get_indexed_slices INT: 1.22[s]
> PERF: [a] get_indexed_slices INT + GTE, LTE on indexed INT : 1.27[s]
> PERF: [cr] get_indexed_slices ASCII: 0.48[s]
> PERF: [cr] get_indexed_slices 2 terms, ASCII: 0.50[s]
> PERF: [cr] get_indexed_slices ASCII + GTE, LTE on non indexed INT : 2.22[s]
> 
> # Run 3, same queries
> PERF: [a] get_range: 0.25[s]
> PERF: [a] get_indexed_slices ASCII: 0.44[s]
> PERF: [a] get_indexed_slices INT: 0.89[s]
> PERF: [a] get_indexed_slices INT + GTE, LTE on indexed INT : 6.58[s]
> PERF: [cr] get_indexed_slices ASCII: 1.18[s]
> PERF: [cr] get_indexed_slices 2 terms, ASCII: 0.50[s]
> PERF: [cr] get_indexed_slices ASCII + GTE, LTE on non indexed INT : 2.09[s]
> 
> 
> 
> 
> 2011/6/21 aaron morton <aaron@thelastpickle.com>:
>> Can you provide some more information on the query you are running? How many terms
>> are you selecting with?
>> 
>> How long does it take to return 1024 rows? IMHO that's a reasonably big slice to
>> get.
>> 
>> The server will pick the most selective equality predicate, and then filter the results
>> from that using the other predicates.
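
A rough sketch of that behaviour in plain Python, not the server's actual code: one equality predicate on an indexed column drives the lookup, and every candidate row is then checked against all predicates. The data layout and the `indexed_query` helper are hypothetical, and "most selective" is approximated here as "fewest matching keys", which is an assumption about how selectivity might be estimated.

```python
import operator

OPS = {"EQ": operator.eq, "GTE": operator.ge, "LTE": operator.le}


def indexed_query(rows, indexes, predicates, count):
    """rows: key -> columns; indexes: column -> value -> set of keys;
    predicates: list of (column, op, value) tuples."""
    # Only equality predicates on indexed columns can drive the lookup.
    candidates = [(c, v) for c, op, v in predicates
                  if op == "EQ" and c in indexes]
    if not candidates:
        raise ValueError("need at least one EQ predicate on an indexed column")
    # Assumption: fewest matching keys means most selective.
    col, val = min(candidates,
                   key=lambda cv: len(indexes[cv[0]].get(cv[1], ())))
    result = []
    for key in sorted(indexes[col].get(val, ())):
        columns = rows[key]
        # Every predicate is checked against the row itself.
        if all(c in columns and OPS[op](columns[c], v)
               for c, op, v in predicates):
            result.append(key)
            if len(result) == count:
                break
    return result
```

In this model the range predicates never narrow the index lookup; they only discard rows after they have been read, so a query can touch far more rows than it returns.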
>> 
>> Cheers
> 
> 
> -- 
> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
>  KosciaK     mail: kosciak1@gmail.com
>                    www : http://kosciak.net/
> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

