From: "Andy Seaborne (JIRA)"
To: jena-dev@incubator.apache.org
Reply-To: jena-dev@incubator.apache.org
Date: Thu, 17 Feb 2011 14:00:32 +0000 (UTC)
Message-ID: <484292404.2757.1297951232235.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <28749399.327621294873666020.JavaMail.jira@thor>
Subject: [jira] Commented: (JENA-29) cancellation during query execution
Mailing-List: contact jena-dev-help@incubator.apache.org; run by ezmlm

[
https://issues.apache.org/jira/browse/JENA-29?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12995811#comment-12995811 ]

Andy Seaborne commented on JENA-29:
-----------------------------------

This is an outline of the contract of "cancel". It is not a description of the implementation.

Currently, QueryExecutionBase.cancel() exists (and is deprecated with the comment "do not use") for testing all this. It will be moved to .abort() when we're ready. A phase of renaming internal methods may happen once the details of the implementation make the exact nature of the methods and fields clear.

1/ To terminate an execution, something calls .cancel() on the QueryExecution, which in turn calls .cancelRequest(). Multiple calls of .cancel() result in one call of cancelRequest(). cancelRequest() is asynchronous to iterator execution.

2/ Internal "cancellation" is also possible, i.e. the system chooses to call .cancel() itself (e.g. a timeout, if done that way, or a limit on the total number of results [not planned], other mysteries).

3/ Cancellation is not required to happen immediately, or indeed to happen at all, but it would probably be considered a bad implementation not to do something. That is, we don't enforce-by-contract that cancel has a specific effect at any specific point.

4/ When cancelled, .hasNext()/.next() on the results iterator are undefined. CONSTRUCT and DESCRIBE will return null.

5/ There are some internal flags to control the behaviour after cancellation is active.

Default behaviour:

A/ Any calls to .hasNext()/.next() throw QueryTerminatedException. They start doing so at some point (cancellation is asynchronous to execution) but the intention is as soon as reasonably possible.

I read the javadoc for the .hasNext() contract as meaning that if hasNext() is true, then NoSuchElementException will not be thrown. You can get ConcurrentModificationException from Java collections from .next() anyway, regardless of .hasNext().

Continuation behaviour:

B/ The QueryIterator is closed during the next call to .next(). An element is returned. The iterator is not explicitly closed by QueryIteratorBase if NoSuchElementException is thrown. Further calls to .hasNext()/.next() may return results. (A little tighter would be to stop if .hasNext() has not yet been called for the next solution; that needs another flag for "is hasNext() already decided".)

> cancellation during query execution
> -----------------------------------
>
>                 Key: JENA-29
>                 URL: https://issues.apache.org/jira/browse/JENA-29
>             Project: Jena
>          Issue Type: Improvement
>          Components: ARQ, TDB
>            Reporter: Simon Helsen
>            Assignee: Andy Seaborne
>         Attachments: JENA-29_ARQ_r8489.patch, JENA-29_TDB_r8489.patch, JENA-29_tests_ARQ_r8489.patch, jena.patch, jenaAddition.patch, queryIterRepeatApply.patch
>
>
> The requested improvement and proposed patch is made by Simon Helsen on behalf of IBM
> ARQ query execution currently does not have a satisfactory way to cancel a running query safely. Moreover, cancel (unlike a hard abort) is especially useful if it is able to provide partial result sets (i.e. all the results it managed to compute up to when the cancellation was requested). Although the exact cancellation behavior depends on the capabilities of the underlying triple store, the proposed patch merely relies on the iterators used by ARQ.
> Here is a more detailed explanation of the proposed changes:
> 1) the cancel() method in the QueryIterator initiates a cancellation request (first boolean flag). In analogy with closeIterator(), it propagates through all chained iterators, so the entire calculation is aware that a cancellation is requested
> 2) to ensure thread-safe semantics, the cancelRequest becomes a real cancel once nextBinding() has been called. It sets the second boolean, which is used in hasNext(). This 2-phase approach is critical since the cancel() method can be called at any time during a query execution by the external thread.
> And because the behavior of hasNext() is such that it has to return the *same* value until next() is called, this is the only way to guarantee semantic safety when cancel() is invoked (let me re-phrase this: it is the only way I was able to make it actually work)
> 3) cancel() does not close anything since it allows execution to finish normally, and the client is responsible for calling close() just like with a regular execution. Note that the client has to call cancel() explicitly (typically in another thread) and has to assume that the returned result set may be incomplete if this method is called (it is undetermined whether the result is _actually_ incomplete)
> 4) in order to deal with order-by and groups, I had to make two more changes. First, I had to make QueryIterSort and QueryIterGroup slightly more lazy. Currently, the full result set is calculated during plan calculation. With my proposed adjustments, this full result set is calculated on the first call to any of its Iterator methods (e.g. hasNext). This change does not AFAIK affect the semantics. Second, because the desired behavior of cancelling a sort or group query is to make sure everything is sorted/grouped even if the total result set is not complete, I added an exception which reverses the cancellation request of the encompassing iterator (as an example see cancel() in QueryIterSort). This makes sure that the entire subset of found and sorted elements is returned, not just the first element. However, it also implies, in the case of sort, that when a query is cancelled, it will first sort the partially complete result set before returning to the client.
> the attached patch is based on ARQ 2.8.5 (and a few classes in TDB 0.8.7 -> possibly the other triple store implementations need adjustment as well)

--
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
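
As a rough illustration of point 1/ and default behaviour A/ in the comment above (repeated .cancel() calls lead to a single cancelRequest(), and .hasNext()/.next() throw once the cancellation has been seen), here is a minimal Java sketch. It is not ARQ code and does not claim to match the eventual implementation: CancelContractSketch, SketchExecution, CancellableIterator, requestCancel() and cancelRequestCount() are hypothetical names made up for this example, and QueryTerminatedException is redefined locally as a stand-in for the exception named in the comment.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch only; none of these types are the real ARQ classes.
final class CancelContractSketch {

    /** Local stand-in for the exception named in the comment. */
    static final class QueryTerminatedException extends RuntimeException {
        QueryTerminatedException() {
            super("query execution was cancelled");
        }
    }

    /** Results iterator that checks a cancel flag on every call (behaviour A/). */
    static final class CancellableIterator<T> implements Iterator<T> {
        private final Iterator<T> source;
        // volatile: the flag is written by the cancelling thread and read by the consumer.
        private volatile boolean cancelRequested = false;

        CancellableIterator(Iterator<T> source) {
            this.source = source;
        }

        void requestCancel() {
            cancelRequested = true;
        }

        @Override
        public boolean hasNext() {
            if (cancelRequested) throw new QueryTerminatedException();
            return source.hasNext();
        }

        @Override
        public T next() {
            if (cancelRequested) throw new QueryTerminatedException();
            return source.next();
        }
    }

    /** Stand-in for a query execution; cancel() may be called from any thread. */
    static final class SketchExecution<T> {
        private final AtomicBoolean cancelCalled = new AtomicBoolean(false);
        private final AtomicInteger cancelRequests = new AtomicInteger();
        private final CancellableIterator<T> results;

        SketchExecution(List<T> solutions) {
            this.results = new CancellableIterator<>(solutions.iterator());
        }

        /** Only the first call gets through to cancelRequest(). */
        void cancel() {
            if (cancelCalled.compareAndSet(false, true)) {
                cancelRequest();
            }
        }

        private void cancelRequest() {
            cancelRequests.incrementAndGet();
            results.requestCancel();
        }

        int cancelRequestCount() {
            return cancelRequests.get();
        }

        Iterator<T> results() {
            return results;
        }
    }

    public static void main(String[] args) {
        SketchExecution<String> exec = new SketchExecution<>(Arrays.asList("s1", "s2", "s3"));
        Iterator<String> it = exec.results();
        System.out.println(it.next());      // one solution is consumed before cancelling
        exec.cancel();
        exec.cancel();                       // second call has no further effect
        try {
            it.hasNext();
        } catch (QueryTerminatedException e) {
            System.out.println("terminated after " + exec.cancelRequestCount() + " cancelRequest call(s)");
        }
    }
}
```

The sketch checks the flag eagerly on every call, which is only one way to read "as soon as reasonably possible"; the contract in the comment leaves the exact point open, and the continuation behaviour B/ is not modelled here.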