Many thanks for the explanation, Aaron.


On Wednesday, July 4, 2012, aaron morton wrote:
> select my_cf where columnA = a and columnB = b and columnC = c and columnD = d
Cassandra will only use one equality clause to select the candidate rows. The other clauses are then applied as filters to the rows selected by that first clause.

The clause used to select candidate rows is chosen from statistics that estimate the number of columns in each index.
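
To make the "one clause selects, the others filter" behaviour concrete, here is a rough in-memory sketch. It is a toy model, not Cassandra's code: the function names are made up for illustration, and "fewest entries per indexed value on average" is only an assumed stand-in for the real statistics.

```python
from collections import defaultdict


def build_indexes(rows, indexed_columns):
    """Build one value -> set(row_key) index per indexed column."""
    indexes = {col: defaultdict(set) for col in indexed_columns}
    for key, row in rows.items():
        for col in indexed_columns:
            if col in row:
                indexes[col][row[col]].add(key)
    return indexes


def pick_primary_clause(indexes, clauses):
    """Pick the clause whose index has the fewest row keys per value on average.

    Assumption: this average stands in for the statistics-based estimate.
    """
    def mean_entries(col):
        buckets = indexes[col]
        if not buckets:
            return 0.0
        return sum(len(keys) for keys in buckets.values()) / len(buckets)

    return min(clauses, key=mean_entries)


def index_scan(rows, indexes, clauses):
    """Select candidate rows with one clause, then filter them with all clauses."""
    primary = pick_primary_clause(indexes, clauses)
    candidates = indexes[primary].get(clauses[primary], set())
    matches = sorted(
        key for key in candidates
        if all(rows[key].get(col) == val for col, val in clauses.items())
    )
    # Return the candidate count too: it is what drives the cost of the scan.
    return primary, len(candidates), matches
```

In this model the cost is driven by the number of candidates returned by the primary clause, since every candidate row has to be read and checked against the remaining clauses.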

> Do you have any ideas? Is there any way to understand how Cassandra internally runs the query (a kind of "explain plan")?
The only way I know of to see the "query plan" is to set DEBUG logging on org.apache.cassandra.db.index.keys.KeysSearcher and look for the message "Primary scan clause is "
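
Once DEBUG is enabled for that class, a small script can pull the relevant lines out of the log. With a log4j 1.x properties file, enabling it would typically be a line like log4j.logger.org.apache.cassandra.db.index.keys.KeysSearcher=DEBUG, but which file your install reads is something to check. The sketch below only relies on the message text quoted above; the helper name and the log path in the usage note are assumptions.

```python
# The marker text is taken from the message quoted above; nothing else about
# the log line layout is assumed.
MARKER = "Primary scan clause is "


def primary_scan_clauses(lines):
    """Return whatever follows the marker on each log line that contains it."""
    found = []
    for line in lines:
        _, sep, rest = line.partition(MARKER)
        if sep:  # the marker was present on this line
            found.append(rest.strip())
    return found
```

Usage would be along the lines of opening the server log (for example /var/log/cassandra/system.log, if that is where your install writes it) and passing the file object to primary_scan_clauses.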

Note: if this is a common query, you may get better performance by creating a custom secondary index than by using four equality clauses in an index scan.
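
As an illustration of what a custom index could look like, here is a toy in-memory model keyed on all four values at once, so the four-clause query becomes a single lookup. The class and method names are hypothetical; in Cassandra this would be a separate column family that the application keeps up to date on every write, which is not shown here.

```python
from collections import defaultdict


class CompositeIndex:
    """Toy model of a hand-maintained index keyed on several column values."""

    def __init__(self, columns):
        self.columns = tuple(columns)
        self.entries = defaultdict(set)  # composite value -> set of row keys

    def _composite(self, row):
        # Build the index key from the row's values, in a fixed column order.
        return tuple(row.get(col) for col in self.columns)

    def insert(self, key, row):
        """Record the row under its composite value (call on every write)."""
        self.entries[self._composite(row)].add(key)

    def remove(self, key, row):
        """Drop the row from its old composite value (call before an update)."""
        self.entries[self._composite(row)].discard(key)

    def lookup(self, **values):
        """Return the row keys matching all indexed columns, in sorted order."""
        composite = tuple(values[col] for col in self.columns)
        return sorted(self.entries.get(composite, set()))
```

The trade-off is that the application has to maintain the index itself: an update to any of the four columns means removing the old entry and inserting the new one.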

> 2/ Are there any limitations on the number of criteria we can usually have?

None that I know of. The query will probably run slower the more clauses you have.

> 3/ Even though we have different data types (date, string, int), we have stored them all as UTF8Type. Could we expect performance improvements if we use DateType, LongType?
No. The main issue is going to be the selectivity of the primary scan clause, followed by the number of additional clauses. Their types will have little or no impact.

Hope that helps.

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 3/07/2012, at 3:59 AM, Olivier Mallassi wrote:

> Hi all
>
> We have 4 indexed columns, all configured as UTF8Type (even though one column is a date and another an integer).
>
> 1/ the read query we run can have up to 4 criteria
> select my_cf where columnA = a and columnB = b and columnC = c and columnD = d
>
> This query is fast (<500ms) with up to 3 criteria, but when we add the fourth one the execution time is 9.5s.
> Do you have any ideas? Is there any way to understand how Cassandra internally runs the query (a kind of "explain plan")?
>
> 2/ Are there any limitations on the number of criteria we can usually have?
>
> 3/ Even though we have different data types (date, string, int), we have stored them all as UTF8Type. Could we expect performance improvements if we use DateType, LongType?
>
> Many thx for all your answers.
>
> --
> ............................................................
> Olivier Mallassi
> OCTO Technology
> ............................................................
> 50, Avenue des Champs-Elysées
> 75008 Paris
>
> Mobile: (33) 6 28 70 26 61
> Tél: (33) 1 58 56 10 00
> Fax: (33) 1 58 56 10 01
>
> http://www.octo.com
> Octo Talks! http://blog.octo.com
>
>


