marmotta-commits mailing list archives

From sschaff...@apache.org
Subject svn commit: r1539974 - /incubator/marmotta/site/trunk/content/markdown/kiwi/sparql.md.vm
Date Fri, 08 Nov 2013 10:54:22 GMT
Author: sschaffert
Date: Fri Nov  8 10:54:21 2013
New Revision: 1539974

URL: http://svn.apache.org/r1539974
Log:
better SPARQL documentation

Modified:
    incubator/marmotta/site/trunk/content/markdown/kiwi/sparql.md.vm

Modified: incubator/marmotta/site/trunk/content/markdown/kiwi/sparql.md.vm
URL: http://svn.apache.org/viewvc/incubator/marmotta/site/trunk/content/markdown/kiwi/sparql.md.vm?rev=1539974&r1=1539973&r2=1539974&view=diff
==============================================================================
--- incubator/marmotta/site/trunk/content/markdown/kiwi/sparql.md.vm (original)
+++ incubator/marmotta/site/trunk/content/markdown/kiwi/sparql.md.vm Fri Nov  8 10:54:21 2013
@@ -11,6 +11,9 @@ cases by translating parts of a SPARQL q
 * full-text search (Marmotta 3.2 and above): adds additional full-text search functions to SPARQL that can be used in
   the FILTER part of a query (see below)
 
+Also, result iterators of an optimized query operate directly on database cursors, so they will be very efficient in
+case only a few results will be retrieved.
+
 Note that KiWi SPARQL does not translate the complete query to SQL. Instead, it walks through the abstract syntax
 tree of a query and optimizes those parts where it can reliably do so and where it makes sense. This allows us to
 make efficient use of the performance of the underlying database while at the same time retaining the flexibility
@@ -83,19 +86,16 @@ literal language of dc:description:
 Performance Considerations
 --------------------------
 
-Even though the reasoner is efficient compared with many other reasoners, there are a number of things to take into
-account, because reasoning is always a potentially expensive operation:
-
-* reasoning will always terminate, but the upper bound for inferred triples is in theory the set of all combinations
-  of nodes occurring in base triples in the database used as subject, predicate, or object, i.e. n^3
-* specific query patterns with many ground values are more efficient than patterns with many variables, as fixed
-  values can considerably reduce the candidate results in the SQL queries while variables are translated into SQL
-  joins
-* re-running a full reasoning can be extremely costly on large databases, so it is better configuring the reasoning
-  programs before importing large datasets (large being in the range of millions of triples)
-* updating a program is more efficient than first deleting the old version and then adding the new version,
-  because the reasoner compares old and new program and only updates the changed rules
-
-In addition, the reasoner is currently executed in a single worker thread. The main reason is that otherwise there
-are potentially many transaction conflicts. We are working on an improved version that could benefit more from
-multi-core processors.
+In practice, the KiWi SPARQL module seriously improves the performance of most SPARQL queries (and even updates) and
+should therefore almost always be used in conjunction with the KiWi triple store. However, there is no magic, and you
+need to keep in mind that certain queries will still be problematic. To improve SPARQL performance, try to follow the
+following recommendations:
+
+* avoid DISTINCT, ORDER BY, GROUP BY: filtering out duplicates is a performance killer, as it requires to first load
+  all results into memory; if you do not strictly need it, do not use it
+* avoid OPTIONAL: optional queries are currently not optimized, as the semantics of OPTIONAL in SPARQL slightly differs
+  from the semantics of an SQL left join
+* avoid subselects: a join with a subselect currently cannot be optimized, because KiWi SPARQL does not work on the
+  results of a SPARQL query, only on the conditions
+* use FILTER: conditions in the FILTER part of a query will be translated into WHERE conditions in SQL; the more precise
+  your filter conditions are, the better your query will perform
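
The recommendations added in this revision can be illustrated with a pair of queries. These are a rough sketch and not part of the commit. The data shape (documents carrying dc:description and dc:title), the variable names, and the "en" language tag are assumptions made up for the example. Whether a particular FILTER expression is actually translated to SQL depends on the KiWi version in use. The first query combines the constructs the new text advises against:

```sparql
PREFIX dc: <http://purl.org/dc/elements/1.1/>

# DISTINCT and ORDER BY force the complete result set to be loaded before
# the first row is returned; the OPTIONAL part is not optimized.
SELECT DISTINCT ?doc ?title WHERE {
  ?doc dc:description ?desc .
  OPTIONAL { ?doc dc:title ?title }
}
ORDER BY ?title
```

A variant that follows the recommendations might look like this:

```sparql
PREFIX dc: <http://purl.org/dc/elements/1.1/>

# Plain triple patterns plus a precise FILTER condition; no DISTINCT,
# ORDER BY, OPTIONAL, or subselect. LIMIT keeps the number of rows read small.
SELECT ?doc ?title WHERE {
  ?doc dc:description ?desc .
  ?doc dc:title ?title .
  FILTER( lang(?desc) = "en" )
}
LIMIT 10
```

The two queries are not equivalent: the second returns only documents that have a title and an English description, may contain duplicate rows, and is unordered. The rewrite is only appropriate when those differences are acceptable for the application.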


