cassandra-user mailing list archives

From Wojciech Pietrzok <kosci...@gmail.com>
Subject Re: Secondary indexes performance
Date Wed, 22 Jun 2011 11:52:32 GMT
OK, got some results (below).
2 nodes, one on localhost, the second on the LAN, reading with
ConsistencyLevel.ONE, buffer_size=512 rows (that's how many rows
pycassa will fetch in one request; it then uses the last row_id as
the start row for the next query)
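
To make the buffer_size behaviour concrete, here is a minimal sketch of that paging loop in plain Python. It is not pycassa's actual code: `fetch_page` and `make_fake_store` are hypothetical stand-ins for one range request and for the cluster, and rows are assumed to come back in key order (a real cluster returns them in partitioner order).

```python
from typing import Callable, Dict, Iterator, List, Tuple

Row = Tuple[str, Dict[str, object]]  # (row_key, columns)


def paged_rows(fetch_page: Callable[[str, int], List[Row]],
               buffer_size: int,
               limit: int) -> Iterator[Row]:
    """Yield up to `limit` rows, asking for at most `buffer_size` per request.

    `fetch_page(start_key, count)` is a hypothetical callable that returns
    up to `count` rows with key >= start_key, in key order.
    """
    if buffer_size < 2:
        # With an inclusive start key, a page of size 1 would only ever
        # contain the duplicate row and the loop could not advance.
        raise ValueError("buffer_size must be at least 2")
    start_key = ""
    yielded = 0
    first_page = True
    while yielded < limit:
        page = fetch_page(start_key, buffer_size)
        # The start key is inclusive, so every page after the first begins
        # with the row that ended the previous page; skip that duplicate.
        rows = page if first_page else page[1:]
        for row in rows:
            yield row
            yielded += 1
            if yielded == limit:
                return
        if len(page) < buffer_size:
            return  # short page: nothing left to read
        start_key = page[-1][0]
        first_page = False


def make_fake_store(keys):
    """Hypothetical in-memory replacement for the cluster, for illustration."""
    ordered = sorted(keys)

    def fetch_page(start_key, count):
        return [(k, {}) for k in ordered if k >= start_key][:count]

    return fetch_page
```

With buffer_size=512 and a limit of 1024 rows, this loop needs three requests rather than two, because each page after the first spends one slot on the repeated start row.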

Query types:
1) get_range - just added a limit of 1024 rows
2) get_indexed_slices ASCII - one term: on an indexed column with ASCII type
3) get_indexed_slices INT - one term: on an indexed column with INT type
4) get_indexed_slices INT + GTE, LTE on indexed INT - three terms:
on an indexed column with INT type + LTE, GTE on an indexed column with INT
type
5) get_indexed_slices 2 terms, ASCII - two terms, both columns
indexed, with ASCII type
6) get_indexed_slices ASCII + GTE, LTE on non indexed INT - like 4)
but LTE, GTE are on a non-indexed column

3 runs for each set of queries; on successive runs times were better.
Times are in seconds.


But if you say that 1024 rows is a reasonably big slice (not to mention
over 10k rows), it will probably be better to denormalize and store
some precomputed data.
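
As a rough illustration of what "denormalize and store some precomputed data" could look like, here is a toy in-memory model. Everything in it is an assumption made for the example: the class name, the `status` and `size` columns, and the use of dicts in place of column families. The idea is only that the lookup table is maintained at write time, so a read becomes a single lookup plus a cheap filter instead of an index query.

```python
from collections import defaultdict


class DenormalizedStore:
    """Toy model: a base table plus a lookup table kept in sync on write."""

    def __init__(self):
        self.rows = {}                      # row_key -> columns
        self.by_status = defaultdict(dict)  # status -> {row_key: size}

    def write(self, row_key, columns):
        old = self.rows.get(row_key)
        if old is not None:
            # Drop the stale entry so the lookup table never disagrees
            # with the base table.
            self.by_status[old["status"]].pop(row_key, None)
        self.rows[row_key] = dict(columns)
        self.by_status[columns["status"]][row_key] = columns["size"]

    def read_by_status(self, status, min_size, max_size):
        """Return row keys with the given status and min_size <= size <= max_size."""
        bucket = self.by_status.get(status, {})
        return sorted(k for k, size in bucket.items()
                      if min_size <= size <= max_size)
```

The cost moves to the write path: every write touches two structures, and an update has to remove the old entry first.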


Results:

# Run 1
PERF: [a] get_range: 0.58[s]
PERF: [a] get_indexed_slices ASCII: 3.96[s]
PERF: [a] get_indexed_slices INT: 1.82[s]
PERF: [a] get_indexed_slices INT + GTE, LTE on indexed INT: 1.31[s] # 314 returned
PERF: [cr] get_indexed_slices ASCII: 1.13[s]
PERF: [cr] get_indexed_slices 2 terms, ASCII: 8.69[s]

# Run 2, same queries
PERF: [a] get_range: 0.33[s]
PERF: [a] get_indexed_slices ASCII: 0.36[s]
PERF: [a] get_indexed_slices INT: 5.39[s]
PERF: [a] get_indexed_slices INT + GTE, LTE on indexed INT : 5.42[s] # 314 returned
PERF: [cr] get_indexed_slices ASCII: 0.55[s]
PERF: [cr] get_indexed_slices 2 terms, ASCII: 3.57[s]

# Run 3, same queries
PERF: [a] get_range: 0.18[s]
PERF: [a] get_indexed_slices ASCII: 0.39[s]
PERF: [a] get_indexed_slices INT: 0.83[s]
PERF: [a] get_indexed_slices INT + GTE, LTE on indexed INT : 0.85[s] # 314 returned
PERF: [cr] get_indexed_slices ASCII: 0.39[s]
PERF: [cr] get_indexed_slices 2 terms, ASCII: 3.36[s]

# changed some terms, so 1024 rows are always returned
# Run 1
PERF: [a] get_range: 0.31[s]
PERF: [a] get_indexed_slices ASCII: 3.14[s]
PERF: [a] get_indexed_slices INT: 0.70[s]
PERF: [a] get_indexed_slices INT + GTE, LTE on indexed INT : 4.72[s]
PERF: [cr] get_indexed_slices ASCII: 0.73[s]
PERF: [cr] get_indexed_slices 2 terms, ASCII: 0.85[s]
PERF: [cr] get_indexed_slices ASCII + GTE, LTE on non indexed INT : 2.17[s]

# Run 2, same queries
PERF: [a] get_range: 0.20[s]
PERF: [a] get_indexed_slices ASCII: 0.60[s]
PERF: [a] get_indexed_slices INT: 1.22[s]
PERF: [a] get_indexed_slices INT + GTE, LTE on indexed INT : 1.27[s]
PERF: [cr] get_indexed_slices ASCII: 0.48[s]
PERF: [cr] get_indexed_slices 2 terms, ASCII: 0.50[s]
PERF: [cr] get_indexed_slices ASCII + GTE, LTE on non indexed INT : 2.22[s]

# Run 3, same queries
PERF: [a] get_range: 0.25[s]
PERF: [a] get_indexed_slices ASCII: 0.44[s]
PERF: [a] get_indexed_slices INT: 0.89[s]
PERF: [a] get_indexed_slices INT + GTE, LTE on indexed INT : 6.58[s]
PERF: [cr] get_indexed_slices ASCII: 1.18[s]
PERF: [cr] get_indexed_slices 2 terms, ASCII: 0.50[s]
PERF: [cr] get_indexed_slices ASCII + GTE, LTE on non indexed INT : 2.09[s]
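
Since single timings vary a lot between runs, it can help to look at min/median/max per query instead of individual numbers. The sketch below parses PERF lines in the format shown above; the regular expression and the `summarize` helper are my own assumptions about that format, not part of the benchmark script. The sample lines are copied from the three runs of the second set.

```python
import re
from collections import defaultdict
from statistics import median

# Matches e.g. "PERF: [a] get_range: 0.31[s]"; tolerates a space before
# the colon and a trailing "# ..." comment.
PERF_LINE = re.compile(r"^PERF: \[(\w+)\] (.+?)\s*: (\d+\.\d+)\[s\]")

SAMPLE = """\
PERF: [a] get_range: 0.31[s]
PERF: [a] get_indexed_slices INT + GTE, LTE on indexed INT : 4.72[s]
PERF: [a] get_range: 0.20[s]
PERF: [a] get_indexed_slices INT + GTE, LTE on indexed INT : 1.27[s]
PERF: [a] get_range: 0.25[s]
PERF: [a] get_indexed_slices INT + GTE, LTE on indexed INT : 6.58[s]
"""


def summarize(log_text):
    """Return {(column_family, query): (min, median, max)} in seconds."""
    samples = defaultdict(list)
    for line in log_text.splitlines():
        match = PERF_LINE.match(line.strip())
        if match:
            cf, query, seconds = match.groups()
            samples[(cf, query)].append(float(seconds))
    return {key: (min(v), median(v), max(v)) for key, v in samples.items()}


summary = summarize(SAMPLE)
```

For these six lines, get_range stays between 0.20 and 0.31 seconds, while the three-term query ranges from 1.27 to 6.58 seconds, so three runs are probably too few to compare the query types reliably.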




2011/6/21 aaron morton <aaron@thelastpickle.com>:
> Can you provide some more information on the query you are running ? How many terms are
> you selecting with?
>
> How long does it take to return 1024 rows ? IMHO thats a reasonably big slice to get.
>
> The server will pick the most selective equality predicate, and then filter the results
> from that using the other predicates.
>
> Cheers
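
The behaviour described in the quoted message (pick the most selective equality predicate, then filter with the remaining ones) can be modelled in a few lines of plain Python. This is a simplified sketch and not Cassandra's implementation: `build_indexes` and `indexed_query` are hypothetical names, indexes are plain dicts, and "most selective" is assumed to mean "fewest matching row keys".

```python
import operator
from collections import defaultdict

OPS = {"EQ": operator.eq, "GTE": operator.ge, "LTE": operator.le}


def build_indexes(rows, indexed_columns):
    """Build {column: {value: set(row_keys)}} for the indexed columns."""
    indexes = {col: defaultdict(set) for col in indexed_columns}
    for key, cols in rows.items():
        for col in indexed_columns:
            if col in cols:
                indexes[col][cols[col]].add(key)
    return indexes


def indexed_query(rows, indexes, expressions, count):
    """Run a list of (column, op, value) expressions.

    Returns (matching_row_keys, rows_scanned).
    """
    # Only an EQ expression on an indexed column can drive the lookup.
    eq_on_index = [e for e in expressions
                   if e[1] == "EQ" and e[0] in indexes]
    if not eq_on_index:
        raise ValueError("need at least one EQ expression on an indexed column")
    # Assumption: selectivity is modelled as the number of candidate rows.
    driver = min(eq_on_index,
                 key=lambda e: len(indexes[e[0]].get(e[2], ())))
    candidates = sorted(indexes[driver[0]].get(driver[2], ()))
    result, scanned = [], 0
    for key in candidates:
        scanned += 1
        cols = rows[key]
        # Every expression, indexed or not, is checked against the row.
        if all(c in cols and OPS[op](cols[c], v)
               for c, op, v in expressions):
            result.append(key)
            if len(result) == count:
                break
    return result, scanned
```

In this model the range terms never narrow the candidate set; they only reject rows after those rows have been read. That is one plausible reason why adding GTE/LTE terms does not make a query cheaper than the single equality term that drives it.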


-- 
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
 KosciaK     mail: kosciak1@gmail.com
                   www : http://kosciak.net/
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
