Return-Path: X-Original-To: apmail-cassandra-commits-archive@www.apache.org Delivered-To: apmail-cassandra-commits-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 0C19C10ED2 for ; Fri, 28 Nov 2014 12:50:15 +0000 (UTC) Received: (qmail 22631 invoked by uid 500); 28 Nov 2014 12:50:14 -0000 Delivered-To: apmail-cassandra-commits-archive@cassandra.apache.org Received: (qmail 22590 invoked by uid 500); 28 Nov 2014 12:50:14 -0000 Mailing-List: contact commits-help@cassandra.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@cassandra.apache.org Delivered-To: mailing list commits@cassandra.apache.org Received: (qmail 22578 invoked by uid 99); 28 Nov 2014 12:50:14 -0000 Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 28 Nov 2014 12:50:14 +0000 Date: Fri, 28 Nov 2014 12:50:14 +0000 (UTC) From: "Benjamin Lerer (JIRA)" To: commits@cassandra.apache.org Message-ID: In-Reply-To: References: Subject: [jira] [Commented] (CASSANDRA-4476) Support 2ndary index queries with only inequality clauses (LT, LTE, GT, GTE) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 [ https://issues.apache.org/jira/browse/CASSANDRA-4476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14228264#comment-14228264 ] Benjamin Lerer commented on CASSANDRA-4476: ------------------------------------------- Here are my feedbacks: * Be carefull with whitespaces and indentation I had to use --ignore-space-change and --ignore-whitespace to be able to apply your patch. I had 82 lines with whitespace errors. * {{SecondaryIndex.supportOperator}} is overloaded. Secondary indices also support {{contains}} and {{contains key}} operators. You ignored them completely and broke that part of the code as you will see if you run {{ContainsRelationTest}}. * The goal of {{SecondaryIndexSearcher.highestSelectivityPredicate}} is to determine which index will be selecting the smallest amount of rows (highest selectivity). There is no reason why an equal operator should select less row than a slice operator. * Index expressions should be grouped when multiple slices apply to the same column. If a user does the following query Select * from myTable where a > 1 and a < 3 you should only scan from 1 to 3 and not from 1 to infinity or from -infinity to 3. * I have some trouble to understand the changes that you made in {{CompositesSearcher}}. Could you add some comments to explain your approach? * You should use meaningfull name for the test methods. Naming them {{bug4476}} force the reader to go to JIRA to have some clue about what the method is actualy testing. * Using {{pageSize}} as an instance variable in {{CQLTest}} is dangerous as it can have some unwanted effect on the other test methods (specially as JUnit does not guarantee the method execution order since Java 7). > Support 2ndary index queries with only inequality clauses (LT, LTE, GT, GTE) > ---------------------------------------------------------------------------- > > Key: CASSANDRA-4476 > URL: https://issues.apache.org/jira/browse/CASSANDRA-4476 > Project: Cassandra > Issue Type: Improvement > Components: API, Core > Reporter: Sylvain Lebresne > Assignee: Oded Peer > Priority: Minor > Labels: cql > Fix For: 3.0 > > Attachments: 4476-2.patch, cassandra-trunk-4476.patch > > > Currently, a query that uses 2ndary indexes must have at least one EQ clause (on an indexed column). Given that indexed CFs are local (and use LocalPartitioner that order the row by the type of the indexed column), we should extend 2ndary indexes to allow querying indexed columns even when no EQ clause is provided. > As far as I can tell, the main problem to solve for this is to update KeysSearcher.highestSelectivityPredicate(). I.e. how do we estimate the selectivity of non-EQ clauses? I note however that if we can do that estimate reasonably accurately, this might provide better performance even for index queries that both EQ and non-EQ clauses, because some non-EQ clauses may have a much better selectivity than EQ ones (say you index both the user country and birth date, for SELECT * FROM users WHERE country = 'US' AND birthdate > 'Jan 2009' AND birtdate < 'July 2009', you'd better use the birthdate index first). -- This message was sent by Atlassian JIRA (v6.3.4#6332)