incubator-jena-dev mailing list archives

From "Simon Helsen (JIRA)" <>
Subject [jira] Commented: (JENA-29) cancellation during query execution
Date Tue, 15 Feb 2011 14:43:57 GMT


Simon Helsen commented on JENA-29:

Andy, a few things:

1) the difference between QueryIter1 and QueryIter2 just follows the pattern for closing.
2) ok, if abort is identical to close except that it tries to suppress exceptions, it can probably
be removed, since we have to catch exceptions around the abort as well as around the actual execution.
3) I cannot share my tests since they are tied to our own framework, which wraps around Jena.
They rely on our own monitoring thread and I don't know how I could move that without essentially
writing new tests. The key is that in these tests, I can manually verify that partial results
come back and that the timeouts are more or less observed. So, I have a query which runs for,
say, 10s, but times out after 1500ms and produces, say, 5 instead of 100 results.
4) QueryIterAbortCancellationRequestException: this exception is thrown whenever there is
an embedded iterator which was cancelled (notably the sort). If I do not "abort" the "cancel",
I would still see at most 1 result instead of all the results which the embedded iterator
found. If you have a better idea of how to handle this, be my guest, but I was not able to
get more than one result when a sorting query was cancelled in the middle.
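
For illustration only, the behaviour wanted in point 4 (a cancelled sort still returns every row it had gathered, in sorted order) might be sketched as below. This is not the code from the attached patches: the class name LazySortSketch is hypothetical, it is a plain java.util.Iterator rather than an ARQ QueryIterator, and instead of throwing an exception to reverse the cancellation it simply lets cancel() stop the gathering phase and never the emitting phase.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a sorting iterator; NOT the QueryIterSort from ARQ or the patch.
class LazySortSketch<T> implements Iterator<T> {
    private final Iterator<T> input;
    private final Comparator<? super T> order;
    private volatile boolean cancelRequested = false; // may be set from another thread
    private Iterator<T> sorted = null;                // built on first hasNext()/next()

    LazySortSketch(Iterator<T> input, Comparator<? super T> order) {
        this.input = input;
        this.order = order;
    }

    // Stops the gathering of input; rows already gathered are still sorted and returned.
    public void cancel() {
        cancelRequested = true;
    }

    // Lazily drains the input (until exhausted or cancelled) and sorts what was gathered.
    private Iterator<T> sorted() {
        if (sorted == null) {
            List<T> buffer = new ArrayList<>();
            while (!cancelRequested && input.hasNext()) {
                buffer.add(input.next());
            }
            buffer.sort(order);
            sorted = buffer.iterator();
        }
        return sorted;
    }

    @Override
    public boolean hasNext() {
        return sorted().hasNext();
    }

    @Override
    public T next() {
        return sorted().next();
    }
}
```

Because the cancel flag is only consulted while input is being gathered, a cancellation that arrives mid-sort yields the sorted subset of what was found, not a single row.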

> cancellation during query execution
> -----------------------------------
>                 Key: JENA-29
>                 URL:
>             Project: Jena
>          Issue Type: Improvement
>          Components: ARQ, TDB
>            Reporter: Simon Helsen
>            Assignee: Andy Seaborne
>         Attachments: JENA-29_ARQ_r8489.patch, JENA-29_TDB_r8489.patch, JENA-29_tests_ARQ_r8489.patch,
jena.patch, jenaAddition.patch
> The requested improvement and proposed patch are made by Simon Helsen on behalf of IBM.
> ARQ query execution currently has no satisfactory way to cancel a running query
safely. Moreover, cancel (unlike a hard abort) is especially useful if it is able to
provide partial result sets (i.e. all the results it managed to compute up to when the cancellation
was requested). Although the exact cancellation behavior depends on the capabilities of the
underlying triple store, the proposed patch merely relies on the iterators used by ARQ.
> Here is a more detailed explanation of the proposed changes:
> 1) the cancel() method in the QueryIterator initiates a cancellation request (first boolean
flag). In analogy with closeIterator(), it propagates through all chained iterators, so the
entire calculation is aware that a cancellation is requested
> 2) to ensure thread-safe semantics, the cancelRequest becomes a real cancel once nextBinding()
has been called. This sets the second boolean flag, which is used in hasNext(). This 2-phase approach
is critical since the cancel() method can be called at any time during a query execution by
the external thread. And because the behavior of hasNext() is such that it has to return the
*same* value until next() is called, this is the only way to guarantee semantic safety when
cancel() is invoked (let me re-phrase this: it is the only way I was able to make it actually
> 3) cancel() does not close anything, since it allows execution to finish normally, and
the client is responsible for calling close() just as with a regular execution. Note that the
client has to call cancel() explicitly (typically in another thread) and has to assume that
the returned result set may be incomplete if this method is called (it is undetermined whether
the result is _actually_ incomplete)
> 4) in order to deal with order-by and groups, I had to make two more changes. First,
I had to make QueryIterSort and QueryIterGroup slightly lazier. Currently, the full
result set is calculated during plan calculation. With my proposed adjustments, this full
result set is computed on the first call to any of its Iterator methods (e.g. hasNext). This
change does not AFAIK affect the semantics. Second, because the desired behavior of cancelling
a sort or group query is to make sure everything is sorted/grouped even if the total result
set is not completed, I added an exception which reverses the cancellation request of the
encompassing iterator (as an example see cancel() in QueryIterSort). This makes sure that
the entire subset of found and sorted elements is returned, not just the first element. However,
it also implies in the case of sort that when a query is cancelled, it will first sort the
partially complete result set before returning to the client.
> The attached patch is based on ARQ 2.8.5 (and a few classes in TDB 0.8.7 -> possibly
the other triple store implementations need adjustment as well)
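
As a rough illustration of the two-phase scheme in points 1) and 2) of the description, the sketch below uses a plain java.util.Iterator in place of ARQ's QueryIterator. The class name TwoPhaseCancelSketch and its field names are hypothetical, next() stands in for nextBinding(), and the code is an assumption about how the described flags could interact, not the code from the attached patches.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical sketch of the two-flag cancellation; NOT an ARQ class.
class TwoPhaseCancelSketch<T> implements Iterator<T> {
    private final Iterator<T> source;
    private volatile boolean cancelRequested = false; // phase 1: may be set by any thread
    private boolean cancelled = false;                // phase 2: set only by the consuming thread

    TwoPhaseCancelSketch(Iterator<T> source) {
        this.source = source;
    }

    // Records the request and passes it down the chain; it closes nothing.
    public void cancel() {
        cancelRequested = true;
        if (source instanceof TwoPhaseCancelSketch) {
            ((TwoPhaseCancelSketch<?>) source).cancel();
        }
    }

    // Depends only on state changed inside next(), so repeated calls between
    // two next() calls return the same answer even if cancel() arrives in between.
    @Override
    public boolean hasNext() {
        return !cancelled && source.hasNext();
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        T item = source.next();
        if (cancelRequested) {
            cancelled = true; // the request becomes a real cancel here
        }
        return item;
    }
}
```

In this sketch a cancel() issued after hasNext() has returned true still lets the next element through; the iteration ends on the following hasNext(). The caller remains responsible for closing whatever resources sit underneath.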

This message is automatically generated by JIRA.
