Return-Path: X-Original-To: apmail-incubator-jena-dev-archive@minotaur.apache.org Delivered-To: apmail-incubator-jena-dev-archive@minotaur.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id B89088885 for ; Mon, 8 Aug 2011 08:39:02 +0000 (UTC) Received: (qmail 30206 invoked by uid 500); 8 Aug 2011 08:39:01 -0000 Delivered-To: apmail-incubator-jena-dev-archive@incubator.apache.org Received: (qmail 30138 invoked by uid 500); 8 Aug 2011 08:38:54 -0000 Mailing-List: contact jena-dev-help@incubator.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: jena-dev@incubator.apache.org Delivered-To: mailing list jena-dev@incubator.apache.org Received: (qmail 30121 invoked by uid 99); 8 Aug 2011 08:38:51 -0000 Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 08 Aug 2011 08:38:51 +0000 X-ASF-Spam-Status: No, hits=-2000.8 required=5.0 tests=ALL_TRUSTED,RP_MATCHES_RCVD X-Spam-Check-By: apache.org Received: from [140.211.11.116] (HELO hel.zones.apache.org) (140.211.11.116) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 08 Aug 2011 08:38:48 +0000 Received: from hel.zones.apache.org (hel.zones.apache.org [140.211.11.116]) by hel.zones.apache.org (Postfix) with ESMTP id 25E10B1E4E for ; Mon, 8 Aug 2011 08:38:27 +0000 (UTC) Date: Mon, 8 Aug 2011 08:38:27 +0000 (UTC) From: "Paolo Castagna (JIRA)" To: jena-dev@incubator.apache.org Message-ID: <1995692762.15990.1312792707151.JavaMail.tomcat@hel.zones.apache.org> Subject: [jira] [Created] (JENA-89) Avoid a total sort for ORDER BY + LIMIT queries MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 X-Virus-Checked: Checked by ClamAV on apache.org Avoid a total sort for ORDER BY + LIMIT queries ----------------------------------------------- Key: JENA-89 URL: https://issues.apache.org/jira/browse/JENA-89 Project: Jena Issue Type: Improvement Components: ARQ Reporter: Paolo Castagna Assignee: Paolo Castagna Priority: Minor In case of SPARQL queries with ORDER BY + LIMIT, ARQ sorts the entire result set and then produces the first N, according to the specified LIMIT. As an alternative, we can use a [PriorityQueue|http://download.oracle.com/javase/6/docs/api/java/util/PriorityQueue.html] (which is based on a priority heap) to avoid a sort operation. ARQ's algebra package contains already a [OpTopN|https://svn.apache.org/repos/asf/incubator/jena/Jena2/ARQ/trunk/src/com/hp/hpl/jena/sparql/algebra/op/OpTopN.java] operator. The [OpExecutor|https://svn.apache.org/repos/asf/incubator/jena/Jena2/ARQ/trunk/src/com/hp/hpl/jena/sparql/engine/main/OpExecutor.java] will need to use a new QueryIterTopN instead of QueryIterSort + QueryIterSlice. A new TransformOrderByLimit to be used by Optimize is also necessary. ORDER BY + LIMIT queries are typically used to construct the first page when results are paginated. Then the following query is ORDER BY + OFFSET + LIMIT. (Often users stop at the first page). Ideally, we could cache the ORDER BY and implement the OFFSET|LIMIT using results from the cache. However, the improvement described by this issue is limited to the ORDER BY + LIMIT case for which a priority heap is a good enough solution. Hopefully, this would improve the scalability of ORDER BY + LIMIT queries in case of small values specified on the LIMIT. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira