Date: Mon, 13 Apr 2015 23:49:15 +0000 (UTC)
From: "Aleksey Yeschenko (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Subject: [jira] [Commented] (CASSANDRA-8717) Top-k queries with custom secondary indexes

    [ https://issues.apache.org/jira/browse/CASSANDRA-8717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493297#comment-14493297 ]

Aleksey Yeschenko commented on CASSANDRA-8717:
----------------------------------------------

I'll review shortly (we have a conference this week, so expect early next week most likely).

In the meantime, can you format the patch to match the project's code style - https://wiki.apache.org/cassandra/CodeStyle ?
Thanks

> Top-k queries with custom secondary indexes
> -------------------------------------------
>
>                 Key: CASSANDRA-8717
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8717
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Andrés de la Peña
>            Assignee: Andrés de la Peña
>            Priority: Minor
>              Labels: 2i, secondary_index, sort, sorting, top-k
>             Fix For: 3.0
>
>         Attachments: 0001-Add-support-for-top-k-queries-in-2i.patch, 0002-Add-support-for-top-k-queries-in-2i.patch
>
>
> As presented in [Cassandra Summit Europe 2014|https://www.youtube.com/watch?v=Hg5s-hXy_-M], secondary indexes can be modified to support general top-k queries with minimal changes to the Cassandra codebase. This way, custom 2i implementations could provide relevance search, sorting by columns, etc.
> Top-k queries retrieve the k best results for a given query. That implies querying the k best rows in each token range and then sorting them in order to obtain the k globally best rows.
> For doing that, we propose two additional methods in class SecondaryIndexSearcher:
> {code:java}
> public boolean requiresFullScan(List<IndexExpression> clause)
> {
>     return false;
> }
> public List<Row> sort(List<IndexExpression> clause, List<Row> rows)
> {
>     return rows;
> }
> {code}
> The first one indicates whether a query performed on the index requires querying all the nodes in the ring. It is necessary in top-k queries because we do not know in advance which nodes hold the best results. The second method specifies how to sort all the partial node results according to the query.
> Then we add two similar methods to the class AbstractRangeCommand:
> {code:java}
> this.searcher = Keyspace.open(keyspace).getColumnFamilyStore(columnFamily).indexManager.searcher(rowFilter);
> public boolean requiresFullScan() {
>     return searcher == null ? false : searcher.requiresFullScan(rowFilter);
> }
> public List<Row> combine(List<Row> rows)
> {
>     return searcher == null ? trim(rows) : trim(searcher.sort(rowFilter, rows));
> }
> {code}
> Finally, we modify StorageProxy#getRangeSlice to use the previous methods, as shown in the attached patch.
> We think that the proposed approach provides very useful functionality with minimal impact on the current codebase.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
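
For readers following the quoted description, the two-phase idea (take the k best rows in each token range, then sort the partial results and trim to the k globally best) can be illustrated with a small standalone sketch. This is not the attached patch and uses no Cassandra classes: TopKSketch, ScoredRow, BEST_FIRST, topK and combine are hypothetical names, and the numeric score is an assumed stand-in for whatever ordering a custom index would apply in its sort() hook.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class TopKSketch
{
    /** Hypothetical stand-in for a row plus the score a custom index assigns to it. */
    static final class ScoredRow
    {
        final String key;
        final double score;

        ScoredRow(String key, double score)
        {
            this.key = key;
            this.score = score;
        }
    }

    /** Best score first; an assumed ordering that plays the role of the proposed sort() hook. */
    static final Comparator<ScoredRow> BEST_FIRST =
        Comparator.comparingDouble((ScoredRow r) -> r.score).reversed();

    /** Local step: the k best rows of a single token range. */
    static List<ScoredRow> topK(List<ScoredRow> rows, int k)
    {
        List<ScoredRow> sorted = new ArrayList<>(rows);
        sorted.sort(BEST_FIRST);
        return new ArrayList<>(sorted.subList(0, Math.min(k, sorted.size())));
    }

    /** Coordinator step: gather every range's local top-k, then sort and trim to k again. */
    static List<ScoredRow> combine(List<List<ScoredRow>> perRange, int k)
    {
        List<ScoredRow> merged = new ArrayList<>();
        for (List<ScoredRow> rows : perRange)
            merged.addAll(topK(rows, k));
        return topK(merged, k);
    }

    public static void main(String[] args)
    {
        List<ScoredRow> rangeA = Arrays.asList(new ScoredRow("a1", 0.9),
                                               new ScoredRow("a2", 0.2),
                                               new ScoredRow("a3", 0.7));
        List<ScoredRow> rangeB = Arrays.asList(new ScoredRow("b1", 0.8),
                                               new ScoredRow("b2", 0.1));
        // Every range is consulted, since any of them may hold a globally best row.
        for (ScoredRow row : combine(Arrays.asList(rangeA, rangeB), 2))
            System.out.println(row.key + " " + row.score);
    }
}
```

With k = 2 the sketch keeps a1 and b1, one row from each range, which is why a top-k query cannot stop after the first range that returns k rows.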