jena-commits mailing list archives

From build...@apache.org
Subject svn commit: r858015 - in /websites/staging/jena/trunk/content: ./ documentation/query/text-query.html
Date Wed, 10 Apr 2013 17:17:07 GMT
Author: buildbot
Date: Wed Apr 10 17:17:07 2013
New Revision: 858015

Log:
Staging update by buildbot for jena

Modified:
    websites/staging/jena/trunk/content/   (props changed)
    websites/staging/jena/trunk/content/documentation/query/text-query.html

Propchange: websites/staging/jena/trunk/content/
------------------------------------------------------------------------------
--- cms:source-revision (original)
+++ cms:source-revision Wed Apr 10 17:17:07 2013
@@ -1 +1 @@
-1466572
+1466573

Modified: websites/staging/jena/trunk/content/documentation/query/text-query.html
==============================================================================
--- websites/staging/jena/trunk/content/documentation/query/text-query.html (original)
+++ websites/staging/jena/trunk/content/documentation/query/text-query.html Wed Apr 10 17:17:07 2013
@@ -182,7 +182,7 @@ the actual label.  More details are give
 <h2 id="table-of-contents">Table of Contents</h2>
 <ul>
 <li><a href="#architecture">Architecture</a></li>
-<li><a href="#with-with-sparql">Query with SPARQL</a></li>
+<li><a href="#query-with-sparql">Query with SPARQL</a></li>
 <li><a href="#configuration">Configuration</a></li>
 <li><a href="#fuseki">Working with Fuseki</a></li>
 </ul>
@@ -198,11 +198,11 @@ or
 properties work with.  When data is added, any properties matching the
 description caus an entry to be added from analysed text from the triple
 object and mapping to the subject.</p>
-<h3 id="pattern-a-rdf-data">Pattern A: RDF data</h3>
+<h3 id="pattern-a-rdf-data">Pattern A -- RDF data</h3>
 <p>In this pattern, the data in the text index is indexing literals in the RDF data.<br />
 Additions to the RDF data are reflected in additions to the index.</p>
 <p>(Deletes do not remove text index netries - <a href="#deletion">see below</a>)</p>
-<h3 id="pattern-b-external-content">Pattern B: External content</h3>
+<h3 id="pattern-b-external-content">Pattern B -- External content</h3>
 <p>There is no requirement that the text data indexed is present in the RDF
 data.  As long as the index contains the index text documents to match the
 index description, then text search can be performed.</p>
@@ -262,7 +262,7 @@ surrounding <code>( )</code> can be omit
 </tr>
 </tbody>
 </table>
-<h2 id="good-practice">Good practice</h2>
+<h3 id="good-practice">Good practice</h3>
 <p>The query execution does not know the selectivity of the text index.  It is
 better to use one of two styles.</p>
 <h3 id="query-pattern-1-find-in-the-text-index-and-enhance-results">Query pattern 1 : Find in the text index and enhance results</h3>
@@ -290,28 +290,6 @@ used to restrict the items found stil fu
 </pre></div>
 
 
-<h2 id="deletion">Deletion</h2>
-<p>If the text index is being maintain by changed to the RDF, then deletion of
-RDF triple or quads does not cause entries in the index to be removed.  The
-index does not store the literal indexed, nor does it store a reference
-count of how many triples refer to the index so the information to delete
-entries is not available. </p>
-<p>In situations where this matters, the SPARQL query should look up in the
-text index, then check in the RDF data.  Indeed, this may be necessary
-anyway because a text search does not necessarily give only exact matches.</p>
-<p>In the initial example:</p>
-<div class="codehilite"><pre><span class="n">SELECT</span> <span class="p">?</span><span class="n">s</span> <span class="p">?</span><span class="n">label</span>
-<span class="p">{</span> <span class="p">?</span><span class="n">s</span> <span class="n">text:query</span> <span class="p">(</span><span class="n">rdfs:label</span> <span class="s">&#39;word&#39;</span> <span class="mi">10</span><span class="p">)</span> <span class="p">;</span>
-     <span class="n">rdfs:label</span> <span class="p">?</span><span class="n">label</span>
-<span class="p">}</span>
-</pre></div>
-
-
-<p>the SPARQL query is checking that the <code>rdfs:label</code> triple exists, and if it
-does, returning the whole label.</p>
-<p>Bu only indexing, and not storing, literals, the index is kept smaller.  It
-may be necessary to periodically rebuild the index if a large proportion
-of the RDF data changes.</p>
 <h2 id="configuration">Configuration</h2>
 <p>The important structure is an "entity map" which defines the properties to
 index, the name of the lucene/solr field and filed used for storing the URI
@@ -335,7 +313,7 @@ index field.</p>
 <span class="n">tdb:DatasetTDB</span>  <span class="n">rdfs:subClassOf</span>  <span class="n">ja:RDFDataset</span> <span class="o">.</span>
 <span class="n">tdb:GraphTDB</span>    <span class="n">rdfs:subClassOf</span>  <span class="n">ja:Model</span> <span class="o">.</span>
 
-<span class="c1">## Initialize LARQ</span>
+<span class="c1">## Initialize text query</span>
 <span class="o">[]</span> <span class="n">ja:loadClass</span>       <span class="s">&quot;org.apache.jena.query.text.TextQuery&quot;</span> <span class="o">.</span>
 <span class="c1"># A TextDataset is a regular dataset with a text index.</span>
 <span class="n">text:TextDataset</span>      <span class="n">rdfs:subClassOf</span>   <span class="n">ja:RDFDataset</span> <span class="o">.</span>
@@ -404,7 +382,7 @@ needs to identify the text dataset by it
 </pre></div>
 
 
-<h3 id="fuseki">Fuseki</h3>
+<h2 id="fuseki">Fuseki</h2>
 <p>The Fuseki configuration simply points to the text dataset as the
 <code>fuseki:dataset</code> of the service.</p>
 <div class="codehilite"><pre><span class="sr">&lt;#service_text_tdb&gt;</span> <span class="n">rdf:type</span> <span class="n">fuseki:Service</span> <span class="p">;</span>
@@ -419,6 +397,30 @@ needs to identify the text dataset by it
     <span class="n">fuseki:dataset</span>                  <span class="p">:</span><span class="n">text_dataset</span> <span class="p">;</span>
     <span class="o">.</span>
 </pre></div>
+
+
+<h2 id="deletion">Deletion</h2>
+<p>If the text index is being maintain by changed to the RDF, then deletion of
+RDF triple or quads does not cause entries in the index to be removed.  The
+index does not store the literal indexed, nor does it store a reference
+count of how many triples refer to the index so the information to delete
+entries is not available. </p>
+<p>In situations where this matters, the SPARQL query should look up in the
+text index, then check in the RDF data.  Indeed, this may be necessary
+anyway because a text search does not necessarily give only exact matches.</p>
+<p>In the initial example:</p>
+<div class="codehilite"><pre><span class="n">SELECT</span> <span class="p">?</span><span class="n">s</span> <span class="p">?</span><span class="n">label</span>
+<span class="p">{</span> <span class="p">?</span><span class="n">s</span> <span class="n">text:query</span> <span class="p">(</span><span class="n">rdfs:label</span> <span class="s">&#39;word&#39;</span> <span class="mi">10</span><span class="p">)</span> <span class="p">;</span>
+     <span class="n">rdfs:label</span> <span class="p">?</span><span class="n">label</span>
+<span class="p">}</span>
+</pre></div>
+
+
+<p>the SPARQL query is checking that the <code>rdfs:label</code> triple exists, and if it
+does, returning the whole label.</p>
+<p>By only indexing, and not storing, literals, the index is kept smaller.  It
+may be necessary to periodically rebuild the index if a large proportion
+of the RDF data changes.</p>
   </div>
 
   <div id="footer">


