cassandra-commits mailing list archives

From "Oded Peer (JIRA)" <>
Subject [jira] [Updated] (CASSANDRA-4476) Support 2ndary index queries with only inequality clauses (LT, LTE, GT, GTE)
Date Tue, 02 Dec 2014 23:48:15 GMT


Oded Peer updated CASSANDRA-4476:
    Attachment: 4476-5.patch


Added 4476-5.patch

I updated the code to scan a restricted range if there are two expressions on the same column.
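
For illustration only (this is not the code in 4476-5.patch), restricting the scan when two inequality expressions target the same column might look roughly like the sketch below. The names {{RestrictedRangeSketch}}, {{Expr}}, {{Range}} and {{restrict}} are hypothetical, and column values are simplified to {{long}}; a real index would compare values using the column's type.

```java
import java.util.List;

final class RestrictedRangeSketch {
    enum Op { GT, GTE, LT, LTE }

    /** One inequality expression on an indexed column (hypothetical type). */
    static final class Expr {
        final String column;
        final Op op;
        final long value;

        Expr(String column, Op op, long value) {
            this.column = column;
            this.op = op;
            this.value = value;
        }
    }

    /** Scan bounds for one column; a null bound means that side is unbounded. */
    static final class Range {
        final Long lower;
        final boolean lowerInclusive;
        final Long upper;
        final boolean upperInclusive;

        Range(Long lower, boolean lowerInclusive, Long upper, boolean upperInclusive) {
            this.lower = lower;
            this.lowerInclusive = lowerInclusive;
            this.upper = upper;
            this.upperInclusive = upperInclusive;
        }

        /** True when no value can satisfy both bounds, e.g. a < 3 and a > 4. */
        boolean isEmpty() {
            if (lower == null || upper == null) return false;
            if (lower > upper) return true;
            return lower.equals(upper) && !(lowerInclusive && upperInclusive);
        }
    }

    /** Folds every inequality on the given column into the tightest range. */
    static Range restrict(String column, List<Expr> exprs) {
        Long lower = null, upper = null;
        boolean lowerInc = false, upperInc = false;
        for (Expr e : exprs) {
            if (!e.column.equals(column)) continue;
            boolean inclusive = e.op == Op.GTE || e.op == Op.LTE;
            if (e.op == Op.GT || e.op == Op.GTE) {
                // Keep the largest lower bound; on a tie the exclusive bound is tighter.
                if (lower == null || e.value > lower || (e.value == lower && !inclusive)) {
                    lower = e.value;
                    lowerInc = inclusive;
                }
            } else {
                // Keep the smallest upper bound; on a tie the exclusive bound is tighter.
                if (upper == null || e.value < upper || (e.value == upper && !inclusive)) {
                    upper = e.value;
                    upperInc = inclusive;
                }
            }
        }
        return new Range(lower, lowerInc, upper, upperInc);
    }
}
```

With this sketch, {{a > 3 and a < 4}} yields the open range (3, 4), while {{a < 3 and a > 4}} yields a range whose {{isEmpty()}} is true, so no scan would be needed.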

{quote}I think that you should use isRange or isSlice instead of isRelationalOrderOperator
as it is clearer.{quote}
I renamed the method.

{quote}The name of the test class, SecondaryIndexNonEqTest, is misleading. CONTAINS and CONTAINS
KEY operators are also non-EQ tests.{quote}
I renamed the test.

{quote}In getRelationalOrderEstimatedSize I do not understand why you do not return 0 if estimatedKeysForRange
returns 0. Could you explain?{quote}
I added comments to the code since I think it should be documented in the code and not in Jira.
I hope it is understandable.

{quote}Instead of doing some dangerous casting in getRelationalOrderEstimatedSize, you should
change the type of bestMeanCount from int to long.{quote}
I changed the type of bestMeanCount.

{quote}In computeNext I do not understand why you do not check for stale data for range queries?
Could you explain?{quote}
I added comments to the code.

{quote}I think it would be nicer to have also an iterator for EQ and use polymorphism instead
of if else.{quote}
Generally I agree polymorphism is good practice; however, I think in this case it would make
the code less readable.

{quote}The close method of the AbstractScanIterator returned by getSequentialIterator should
be called from the close method.{quote}
Thanks. I wasn't aware the iterator is closeable; they usually aren't.

{quote}The Unit tests are only covering a subset of the possible queries. Could you add more
(a > 3 and a < 4, a < 3 and a > 4 ...){quote}

{quote}When testing for InvalidRequestException you should use assertInvalidMessage{quote}
Thanks, I wanted to use such a method but couldn't find it on my own.


I don't understand the problem in your example. The query result seems valid to me.
In addition, can you please explain how a query using only secondary indexes such as {{select
k from my_table where index1 = 5 and index2 > 10 allow filtering}} retains token order?

> Support 2ndary index queries with only inequality clauses (LT, LTE, GT, GTE)
> ----------------------------------------------------------------------------
>                 Key: CASSANDRA-4476
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: API, Core
>            Reporter: Sylvain Lebresne
>            Assignee: Oded Peer
>            Priority: Minor
>              Labels: cql
>             Fix For: 3.0
>         Attachments: 4476-2.patch, 4476-3.patch, 4476-5.patch, cassandra-trunk-4476.patch
> Currently, a query that uses 2ndary indexes must have at least one EQ clause (on an indexed
column). Given that indexed CFs are local (and use LocalPartitioner, which orders the rows by
the type of the indexed column), we should extend 2ndary indexes to allow querying indexed
columns even when no EQ clause is provided.
> As far as I can tell, the main problem to solve for this is to update KeysSearcher.highestSelectivityPredicate().
I.e. how do we estimate the selectivity of non-EQ clauses? I note however that if we can do
that estimate reasonably accurately, this might provide better performance even for index
queries that have both EQ and non-EQ clauses, because some non-EQ clauses may have a much better
selectivity than EQ ones (say you index both the user country and birth date; for SELECT *
FROM users WHERE country = 'US' AND birthdate > 'Jan 2009' AND birthdate < 'July 2009',
you'd better use the birthdate index first).
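
A minimal sketch of the selection idea in the description above, assuming each clause already has an estimated number of matching rows: pick the clause with the lowest estimate and drive the query from its index. The class and method names are hypothetical and this is not the actual KeysSearcher code; how to produce the estimates for non-EQ clauses is the open question the issue raises and is not addressed here.

```java
import java.util.Map;

final class SelectivitySketch {
    /**
     * Returns the clause expected to match the fewest rows, or null when
     * the map is empty. The estimates are inputs here; computing them for
     * non-EQ clauses is outside the scope of this sketch.
     */
    static String mostSelective(Map<String, Long> estimatedRowsByClause) {
        String best = null;
        long bestMeanCount = Long.MAX_VALUE; // long, so large estimates need no cast
        for (Map.Entry<String, Long> e : estimatedRowsByClause.entrySet()) {
            if (best == null || e.getValue() < bestMeanCount) {
                best = e.getKey();
                bestMeanCount = e.getValue();
            }
        }
        return best;
    }
}
```

With made-up estimates of 1,000,000 rows for the country clause and 20,000 rows for the birthdate range, the sketch would select the birthdate clause.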

This message was sent by Atlassian JIRA
