cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Benjamin Lerer (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-4476) Support 2ndary index queries with only inequality clauses (LT, LTE, GT, GTE)
Date Fri, 28 Nov 2014 12:50:14 GMT


Benjamin Lerer commented on CASSANDRA-4476:

Here are my feedbacks:
* Be carefull with whitespaces and indentation I had to use --ignore-space-change and --ignore-whitespace
to be able to apply your patch. I had 82 lines with whitespace errors.
* {{SecondaryIndex.supportOperator}} is overloaded. Secondary indices also support {{contains}}
and {{contains key}} operators.
You ignored them completely and broke that part of the code as you will see if you run {{ContainsRelationTest}}.
* The goal of {{SecondaryIndexSearcher.highestSelectivityPredicate}} is to determine which
index will be selecting the smallest amount of rows (highest selectivity). There is no reason
why an equal operator should select less row than a slice operator.
* Index expressions should be grouped when multiple slices apply to the same column. If a
user does the following query Select * from myTable where a > 1 and a < 3 you should
only scan from 1 to 3 and not from 1 to infinity or from -infinity to 3.
* I have some trouble to understand the changes that you made in  {{CompositesSearcher}}.
Could you add some comments to explain your approach?
* You should use meaningfull name for the test methods. Naming them {{bug4476}} force the
reader to go to JIRA to have some clue about what the method is actualy testing.
* Using {{pageSize}} as an instance variable in {{CQLTest}} is dangerous as it can have some
unwanted effect on the other test methods (specially as JUnit does not guarantee the method
execution order since Java 7).

> Support 2ndary index queries with only inequality clauses (LT, LTE, GT, GTE)
> ----------------------------------------------------------------------------
>                 Key: CASSANDRA-4476
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: API, Core
>            Reporter: Sylvain Lebresne
>            Assignee: Oded Peer
>            Priority: Minor
>              Labels: cql
>             Fix For: 3.0
>         Attachments: 4476-2.patch, cassandra-trunk-4476.patch
> Currently, a query that uses 2ndary indexes must have at least one EQ clause (on an indexed
column). Given that indexed CFs are local (and use LocalPartitioner that order the row by
the type of the indexed column), we should extend 2ndary indexes to allow querying indexed
columns even when no EQ clause is provided.
> As far as I can tell, the main problem to solve for this is to update KeysSearcher.highestSelectivityPredicate().
I.e. how do we estimate the selectivity of non-EQ clauses? I note however that if we can do
that estimate reasonably accurately, this might provide better performance even for index
queries that both EQ and non-EQ clauses, because some non-EQ clauses may have a much better
selectivity than EQ ones (say you index both the user country and birth date, for SELECT *
FROM users WHERE country = 'US' AND birthdate > 'Jan 2009' AND birtdate < 'July 2009',
you'd better use the birthdate index first).

This message was sent by Atlassian JIRA

View raw message