Return-Path: X-Original-To: apmail-incubator-jena-commits-archive@minotaur.apache.org Delivered-To: apmail-incubator-jena-commits-archive@minotaur.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 184129660 for ; Tue, 24 Jan 2012 23:40:10 +0000 (UTC) Received: (qmail 50863 invoked by uid 500); 24 Jan 2012 23:40:10 -0000 Delivered-To: apmail-incubator-jena-commits-archive@incubator.apache.org Received: (qmail 50836 invoked by uid 500); 24 Jan 2012 23:40:09 -0000 Mailing-List: contact jena-commits-help@incubator.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: jena-dev@incubator.apache.org Delivered-To: mailing list jena-commits@incubator.apache.org Received: (qmail 50829 invoked by uid 99); 24 Jan 2012 23:40:09 -0000 Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 24 Jan 2012 23:40:09 +0000 X-ASF-Spam-Status: No, hits=-2000.0 required=5.0 tests=ALL_TRUSTED X-Spam-Check-By: apache.org Received: from [140.211.11.4] (HELO eris.apache.org) (140.211.11.4) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 24 Jan 2012 23:40:03 +0000 Received: from eris.apache.org (localhost [127.0.0.1]) by eris.apache.org (Postfix) with ESMTP id B191423889B8 for ; Tue, 24 Jan 2012 23:39:41 +0000 (UTC) Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Subject: svn commit: r803222 - in /websites/staging/jena/trunk/content/jena/documentation/query: index.html manipulating_sparql_using_arq.html Date: Tue, 24 Jan 2012 23:39:41 -0000 To: jena-commits@incubator.apache.org From: buildbot@apache.org X-Mailer: svnmailer-1.0.8-patched Message-Id: <20120124233941.B191423889B8@eris.apache.org> X-Virus-Checked: Checked by ClamAV on apache.org Author: buildbot Date: Tue Jan 24 23:39:41 2012 New Revision: 803222 Log: Staging update by buildbot for jena Added: 
websites/staging/jena/trunk/content/jena/documentation/query/manipulating_sparql_using_arq.html Modified: websites/staging/jena/trunk/content/jena/documentation/query/index.html Modified: websites/staging/jena/trunk/content/jena/documentation/query/index.html ============================================================================== --- websites/staging/jena/trunk/content/jena/documentation/query/index.html (original) +++ websites/staging/jena/trunk/content/jena/documentation/query/index.html Tue Jan 24 23:39:41 2012 @@ -185,6 +185,7 @@ SPARQL is the query language developed b
  • Command line utilities
  • Logging
  • Explaining queries
  • +
  • Tutorial: manipulating SPARQL using ARQ
  • Advanced SPARQL use

    Features of ARQ that are legal SPARQL syntax.

    Added: websites/staging/jena/trunk/content/jena/documentation/query/manipulating_sparql_using_arq.html ============================================================================== --- websites/staging/jena/trunk/content/jena/documentation/query/manipulating_sparql_using_arq.html (added) +++ websites/staging/jena/trunk/content/jena/documentation/query/manipulating_sparql_using_arq.html Tue Jan 24 23:39:41 2012 @@ -0,0 +1,360 @@ + + + + + + + Apache Jena - Tutorial - Manipulating SPARQL using ARQ + + + + + + + + + + +
    +

    Tutorial - Manipulating SPARQL using ARQ

    +

    When you've been working with SPARQL you quickly find that static queries are restrictive. Maybe you want to vary a value, add a filter, or alter the limit. Being an impatient sort you dive into the query string, and it works. But what about little Bobby Tables? And even if you sanitise your inputs, string manipulation is a fraught process and syntax errors await you. Although it might seem harder than string munging, the ARQ API is your friend in the long run.

    +

    Originally published on the Research Revealed project blog

    +

    Inserting values (simple prepared statements)

    +

    Let's begin with something simple. Suppose we wanted to restrict the following query to a particular person:

    +
       select * { ?person <http://xmlns.com/foaf/0.1/name> ?name }
    +
    + + +

    String#replaceAll would work, but there is a safer way. QueryExecutionFactory in most cases lets you supply a QuerySolution with which you can prebind values.

    +
       QuerySolutionMap initialBinding = new QuerySolutionMap();
    +   initialBinding.add("person", personResource);          // bind ?person, the variable we want to restrict
    +   qe = QueryExecutionFactory.create(query, dataset, initialBinding);
    +
    + + +

    This is often much simpler than the string equivalent since you don't have to escape quotes in literals. (Beware that this doesn't work for sparqlService, which is a great shame. It would be nice to spend some time remedying that.)
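    To make the escaping problem concrete, here is a minimal sketch in plain Java (no ARQ involved) of what the string route looks like. The class name and the escapeLiteral helper are hypothetical, introduced only for illustration, and the helper handles just backslashes and double quotes; SPARQL string literals have further escapes (newlines, tabs and so on) that a real implementation would also need.

```java
// Illustration only: plain string handling, not part of the ARQ API.
class StringSplicing {
    // Naive approach: paste the supplied name straight into the query text.
    static String naiveQuery(String name) {
        return "SELECT ?person { ?person <http://xmlns.com/foaf/0.1/name> \""
                + name + "\" }";
    }

    // Hypothetical helper: escape backslashes first, then double quotes.
    // Deliberately incomplete (newlines and other escapes are not handled).
    static String escapeLiteral(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Same query, but with the name escaped before it is spliced in.
    static String escapedQuery(String name) {
        return naiveQuery(escapeLiteral(name));
    }

    public static void main(String[] args) {
        String name = "Robert \"Bobby\" Tables";
        // The embedded quotes end the literal early, so this is a syntax error:
        System.out.println(naiveQuery(name));
        // The escaped version keeps the whole name inside one literal:
        System.out.println(escapedQuery(name));
    }
}
```

    With prebinding none of this is needed, because the value never passes through the query text.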

    +

    Making a Query from Scratch

    +

    The previously mentioned limitation is due to the fact that prebinding doesn't actually change the query at all, only the execution of that query. So how do we really alter queries?

    +

    ARQ provides two ways to work with queries: at the syntax level (Query and Element), or the algebra level (Op). The distinction is clear in filters:

    +
       SELECT ?s { ?s <http://example.com/val> ?val . FILTER ( ?val < 20 ) }
    +
    + + +

    If you work at the syntax level you'll find that this looks (in pseudocode) like:

    +
       (GROUP (PATTERN ( ?s <http://example.com/val> ?val )) (FILTER ( < ?val 20 ) ))
    +
    + + +

    That is, there's a group containing a triple pattern and a filter, just as you see in the query. The algebra is different, and we can see it using arq.qparse --print op:

    +
       $ java arq.qparse --print op 'SELECT ?s { ?s <http://example.com/val> ?val . FILTER ( ?val < 20 ) }'
    +   (base <file:///...>
    +       (project (?s)
    +           (filter (< ?val 20)
    +               (bgp (triple ?s <http://example.com/val> ?val)))))
    +
    + + +

    Here the filter contains the pattern, rather than sitting next to it. This form makes it clear that the expression is filtering the pattern.

    +

    Let's create a query of that shape from scratch using ARQ. We begin with some common pieces: the triple to match, and the expression for the filter.

    +
       // ?s ?p ?o .
    +   Triple pattern =
    +       Triple.create(Var.alloc("s"), Var.alloc("p"), Var.alloc("o"));
    +   // ( ?o < 20 ), filtering on the object value as in the query above
    +   Expr e = new E_LessThan(new ExprVar("o"), new NodeValueInteger(20));
    +
    + + +

    Triple should be familiar from Jena. Var is an extension of Node for variables. Expr is the root interface for expressions, those things that appear in FILTER and LET.

    +

    First the syntax route:

    +
       ElementTriplesBlock block = new ElementTriplesBlock(); // Make a BGP
    +   block.addTriple(pattern);                              // Add our pattern match
    +   ElementFilter filter = new ElementFilter(e);           // Make a filter matching the expression
    +   ElementGroup body = new ElementGroup();                // Group our pattern match and filter
    +   body.addElement(block);
    +   body.addElement(filter);
    +
    +   Query q = QueryFactory.make();
    +   q.setQueryPattern(body);                               // Set the body of the query to our group
    +   q.setQuerySelectType();                                // Make it a select query
    +   q.addResultVar("s");                                   // Select ?s
    +
    + + +

    Now the algebra:

    +
       Op op;
    +   BasicPattern pat = new BasicPattern();                 // Make a pattern
    +   pat.add(pattern);                                      // Add our pattern match
    +   op = new OpBGP(pat);                                   // Make a BGP from this pattern
    +   op = OpFilter.filter(e, op);                           // Filter that pattern with our expression
    +   op = new OpProject(op, Arrays.asList(Var.alloc("s"))); // Reduce to just ?s
    +   Query q = OpAsQuery.asQuery(op);                       // Convert to a query
    +   q.setQuerySelectType();                                // Make it a select query
    +
    + + +

    Notice that the query form (SELECT, CONSTRUCT, DESCRIBE, ASK) isn't part of the algebra, and we have to set this in the query (although SELECT is the default). FROM and FROM NAMED are similarly absent.

    + +

    You can also look around the algebra and syntax using visitors. Start by extending OpVisitorBase (ElementVisitorBase), which stubs out the interface so you can concentrate on the parts of interest, then walk using OpWalker.walk(Op, OpVisitor) (ElementWalker.walk(Element, ElementVisitor)). These work bottom up.
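    The pattern described here (a stubbed-out base visitor plus a bottom-up walker) can be sketched in plain Java. Every type below is a stand-in invented for this illustration, not ARQ's own Op, OpVisitorBase or OpWalker; the sketch only shows the order in which a bottom-up walk reaches the nodes of the algebra printed earlier.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Stand-in types for illustration only; these are NOT ARQ's classes.
class WalkDemo {
    interface Visitor {
        void visit(Bgp bgp);
        void visit(Filter filter);
        void visit(Project project);
    }

    // Plays the role of a visitor base class: every method is an empty stub.
    static class VisitorBase implements Visitor {
        public void visit(Bgp bgp) {}
        public void visit(Filter filter) {}
        public void visit(Project project) {}
    }

    interface Node {
        List<Node> children();
        void accept(Visitor v);
    }

    static class Bgp implements Node {
        final String triple;
        Bgp(String triple) { this.triple = triple; }
        public List<Node> children() { return Collections.emptyList(); }
        public void accept(Visitor v) { v.visit(this); }
    }

    static class Filter implements Node {
        final String expr;
        final Node sub;
        Filter(String expr, Node sub) { this.expr = expr; this.sub = sub; }
        public List<Node> children() { return Collections.singletonList(sub); }
        public void accept(Visitor v) { v.visit(this); }
    }

    static class Project implements Node {
        final String var;
        final Node sub;
        Project(String var, Node sub) { this.var = var; this.sub = sub; }
        public List<Node> children() { return Collections.singletonList(sub); }
        public void accept(Visitor v) { v.visit(this); }
    }

    // Bottom-up walk: visit the children first, then the node itself.
    static void walk(Node node, Visitor v) {
        for (Node child : node.children()) walk(child, v);
        node.accept(v);
    }

    // Record what the visitor sees, overriding only the methods of interest.
    static List<String> visitOrder() {
        Node tree = new Project("?s",
                new Filter("(< ?val 20)",
                        new Bgp("?s <http://example.com/val> ?val")));
        final List<String> seen = new ArrayList<>();
        walk(tree, new VisitorBase() {
            @Override public void visit(Bgp bgp) { seen.add("bgp"); }
            @Override public void visit(Filter filter) { seen.add("filter " + filter.expr); }
        });
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(visitOrder()); // the bgp is reached before the filter around it
    }
}
```

    Because the project node is not overridden, the base class silently ignores it, which is the convenience the stubbed base is meant to provide.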

    +

    For some alterations, like manipulating triple matches in place, visitors will do the trick. They provide a simple way to get to the right parts of the query, and you can alter the pattern backing BGPs in both the algebra and syntax. Mutation isn't consistently available, however, so don't depend on it.

    +

    Transforming the Algebra

    +

    So far there is no obvious advantage in using the algebra. The real power is visible in transformers, which allow you to reorganise an algebra completely. ARQ makes extensive use of transformations to simplify and optimise query execution.

    +

    In Research Revealed I wrote some code to take a number of constraints and produce a query. There were a number of ways to do this, but one way I found was to generate ops from each constraint and join the results:

    +
       for (Constraint con: cons) {
    +       op = OpJoin.create(op, consToOp(con)); // join this constraint's op onto the rest
    +   }
    +
    + + +

    The result was a perfectly correct mess, which is only barely readable with just three conditions:

    +
       (join
    +       (join
    +           (filter (< ?o0 20) (bgp (triple ?s <urn:ex:prop0> ?o0)))
    +           (filter (< ?o1 20) (bgp (triple ?s <urn:ex:prop1> ?o1))))
    +       (filter (< ?o2 20) (bgp (triple ?s <urn:ex:prop2> ?o2))))
    +
    + + +

    Each of the constraints is a filter on a bgp. This can be made much more readable by moving the filters out, and merging the triple patterns. We can do this with the following Transform:

    +
       class QueryCleaner extends TransformBase
    +   {
    +       @Override
    +       public Op transform(OpJoin join, Op left, Op right) {
    +           // Bail if not of the right form, keeping any rewrites already made below
    +           if (!(left instanceof OpFilter && right instanceof OpFilter)) return OpJoin.create(left, right);
    +           OpFilter leftF = (OpFilter) left;
    +           OpFilter rightF = (OpFilter) right;
    +           // Both filters must sit directly on a BGP, otherwise the casts below would fail
    +           if (!(leftF.getSubOp() instanceof OpBGP && rightF.getSubOp() instanceof OpBGP)) return OpJoin.create(left, right);
    +
    +           // Add all of the triple matches to the LHS BGP
    +           ((OpBGP) leftF.getSubOp()).getPattern().addAll(((OpBGP) rightF.getSubOp()).getPattern());
    +           // Add the RHS filter to the LHS
    +           leftF.getExprs().addAll(rightF.getExprs());
    +           return leftF;
    +       }
    +   }
    +   ...
    +   op = Transformer.transform(new QueryCleaner(), op); // clean query
    +
    + + +

    This looks for joins of the form:

    +
       (join
    +       (filter (exp1) (bgp1))
    +       (filter (exp2) (bgp2)))
    +
    + + +

    And replaces it with:

    +
       (filter (exp1 && exp2) (bgp1 && bgp2))
    +
    + + +

    As we go through the original query all joins are removed, and the result is:

    +
       (filter (exprlist (< ?o0 20) (< ?o1 20) (< ?o2 20))
    +       (bgp
    +           (triple ?s <urn:ex:prop0> ?o0)
    +           (triple ?s <urn:ex:prop1> ?o1)
    +           (triple ?s <urn:ex:prop2> ?o2)
    +   ))
    +
    + + +
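    The rewrite rule itself can be tried out without ARQ. The following is a minimal sketch using stand-in classes invented for this illustration (they are not ARQ's Op, OpJoin or Transform types), with expressions and triples held as plain strings. Unlike the QueryCleaner above, which edits the left-hand filter in place, this version builds a new node for each merge, which is an assumption made here to keep the example free of shared mutable state.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Stand-in types for illustration only; these are NOT ARQ's classes.
class MergeDemo {
    interface Op {}

    // A filter sitting directly on a BGP: a list of expressions over a list of triples.
    static final class Filter implements Op {
        final List<String> exprs;
        final List<String> triples;
        Filter(List<String> exprs, List<String> triples) {
            this.exprs = exprs;
            this.triples = triples;
        }
    }

    static final class Join implements Op {
        final Op left;
        final Op right;
        Join(Op left, Op right) { this.left = left; this.right = right; }
    }

    // Bottom-up rewrite: transform both sides first, then merge them
    // if both turned out to be filtered BGPs.
    static Op transform(Op op) {
        if (!(op instanceof Join)) return op;
        Join join = (Join) op;
        Op left = transform(join.left);
        Op right = transform(join.right);
        if (!(left instanceof Filter && right instanceof Filter)) {
            return new Join(left, right); // not the right form, keep the join
        }
        Filter l = (Filter) left;
        Filter r = (Filter) right;
        List<String> exprs = new ArrayList<>(l.exprs);
        exprs.addAll(r.exprs);
        List<String> triples = new ArrayList<>(l.triples);
        triples.addAll(r.triples);
        return new Filter(exprs, triples); // new node, inputs left untouched
    }

    // Build one constraint of the same shape as in the example above.
    static Op constraint(int i) {
        return new Filter(
                Collections.singletonList("(< ?o" + i + " 20)"),
                Collections.singletonList("?s <urn:ex:prop" + i + "> ?o" + i));
    }

    // Join three constraints left to right, then clean up the result.
    static Op cleaned() {
        Op op = new Join(new Join(constraint(0), constraint(1)), constraint(2));
        return transform(op);
    }

    public static void main(String[] args) {
        Filter f = (Filter) cleaned();
        System.out.println(f.exprs);
        System.out.println(f.triples);
    }
}
```

    The inner join is merged first, so by the time the outer join is examined its left side is already a single filter and the same rule applies again.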

    That completes this brief introduction. There is much more to ARQ, of course, but hopefully you now have a taste for what it can do.

    +
    + + + + +