Return-Path: X-Original-To: apmail-incubator-jena-dev-archive@minotaur.apache.org Delivered-To: apmail-incubator-jena-dev-archive@minotaur.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id D8EEC839E for ; Fri, 12 Aug 2011 13:41:49 +0000 (UTC) Received: (qmail 80641 invoked by uid 500); 12 Aug 2011 13:41:49 -0000 Delivered-To: apmail-incubator-jena-dev-archive@incubator.apache.org Received: (qmail 80616 invoked by uid 500); 12 Aug 2011 13:41:49 -0000 Mailing-List: contact jena-dev-help@incubator.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: jena-dev@incubator.apache.org Delivered-To: mailing list jena-dev@incubator.apache.org Received: (qmail 80608 invoked by uid 99); 12 Aug 2011 13:41:48 -0000 Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 12 Aug 2011 13:41:48 +0000 X-ASF-Spam-Status: No, hits=-2000.8 required=5.0 tests=ALL_TRUSTED,RP_MATCHES_RCVD X-Spam-Check-By: apache.org Received: from [140.211.11.116] (HELO hel.zones.apache.org) (140.211.11.116) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 12 Aug 2011 13:41:47 +0000 Received: from hel.zones.apache.org (hel.zones.apache.org [140.211.11.116]) by hel.zones.apache.org (Postfix) with ESMTP id 2CBD2B9D02 for ; Fri, 12 Aug 2011 13:41:27 +0000 (UTC) Date: Fri, 12 Aug 2011 13:41:27 +0000 (UTC) From: "Andy Seaborne (JIRA)" To: jena-dev@incubator.apache.org Message-ID: <1611089758.33221.1313156487179.JavaMail.tomcat@hel.zones.apache.org> In-Reply-To: <1995692762.15990.1312792707151.JavaMail.tomcat@hel.zones.apache.org> Subject: [jira] [Updated] (JENA-89) Avoid a total sort for ORDER BY + LIMIT queries MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 [ https://issues.apache.org/jira/browse/JENA-89?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Seaborne updated JENA-89: ------------------------------ Comment: was deleted (was: OpBuilder was broken in another was as well: "Op sub = build(list, 3) ;" now fixed. ) > Avoid a total sort for ORDER BY + LIMIT queries > ----------------------------------------------- > > Key: JENA-89 > URL: https://issues.apache.org/jira/browse/JENA-89 > Project: Jena > Issue Type: Improvement > Components: ARQ > Reporter: Paolo Castagna > Assignee: Paolo Castagna > Priority: Minor > Labels: arq, optimization, scalability, sparql > Attachments: ARQ_JENA-89_r1156212.patch > > > In case of SPARQL queries with ORDER BY + LIMIT, ARQ sorts the entire result set and then produces the first N, according to the specified LIMIT. > As an alternative, discussed on jena-dev [1], we can use a PriorityQueue [2] (which is based on a priority heap) to avoid a sort operation. > ARQ's algebra package contains already a OpTopN [3] operator. The OpExecutor [4] will need to use a new QueryIterTopN instead of QueryIterSort + QueryIterSlice. A new TransformOrderByLimit to be used by Optimize is also necessary. > ORDER BY + LIMIT queries are typically used to construct the first page when results are paginated. Then the following query is ORDER BY + OFFSET + LIMIT. (Often users stop at the first page). Ideally, we could cache the ORDER BY and implement the OFFSET|LIMIT using results from the cache. However, the improvement described by this issue is limited to the ORDER BY + LIMIT case for which a priority heap is a good enough solution. > Hopefully, this would improve the scalability of ORDER BY + LIMIT queries in case of small values specified on the LIMIT. > [1] http://markmail.org/thread/5d2gtazkoxsa2ayv > [2] http://download.oracle.com/javase/6/docs/api/java/util/PriorityQueue.html > [3] https://svn.apache.org/repos/asf/incubator/jena/Jena2/ARQ/trunk/src/com/hp/hpl/jena/sparql/algebra/op/OpTopN.java > [4] https://svn.apache.org/repos/asf/incubator/jena/Jena2/ARQ/trunk/src/com/hp/hpl/jena/sparql/engine/main/OpExecutor.java -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira