incubator-jena-dev mailing list archives

From Paolo Castagna
Subject LARQ documentation
Date Tue, 10 May 2011 22:33:06 GMT
Below are two new paragraphs that could be added at the bottom of the current
lucene-arq.html page.



<h2>A new LARQ module</h2>
<p>
  A new LARQ is available as a module separate from ARQ, which allows the two
  modules to have independent release cycles. The Lucene dependency has been
  upgraded from v2.3.1 to v3.1.0 (i.e. the latest stable Lucene release). Two
  other improvements to LARQ are support for index removals/deletions, which
  can be used to keep a Lucene index in sync with an RDF Dataset/DataSource as
  RDF triples are added to or removed from it, and duplicate avoidance using
  the Lucene index itself instead of in-memory data structures. These two
  improvements required an additional field in the Lucene index, so a reindex
  is necessary to use the new LARQ module.</p>
<p>
  Once LARQ is included in the classpath, the larq.larqbuilder and larq.larq
  helper commands are available. They work the same as the arq.larqbuilder and
  arq.larq commands, with only one additional option for larq.larqbuilder:</p>
<ul>
  <li><code>--allow-duplicates</code> : Suppress duplicate avoidance using the
  Lucene index. This is recommended for bulk indexing large RDF datasets (even
  if it might add a few duplicate documents to the Lucene index).</li>
</ul>
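The trade-off behind this option can be pictured with a small toy model. The sketch below is only an illustration under stated assumptions: it uses a plain in-memory list as a stand-in for the index, every class and method name in it (ToyTextIndex, index, unindex) is hypothetical, and it does not use the LARQ or Lucene APIs. It shows that checking the index before each add costs one lookup per document, that skipping the check avoids those lookups but can store duplicates, and that a removal drops a document so the index can follow deletions in the data.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Toy stand-in for a text index. All names are hypothetical;
 * this is not the LARQ or Lucene API.
 */
class ToyTextIndex {
    // One entry per indexed "document": a node plus its literal text.
    private final List<String> docs = new ArrayList<>();
    private final boolean allowDuplicates;
    private int lookups = 0;

    ToyTextIndex(boolean allowDuplicates) {
        this.allowDuplicates = allowDuplicates;
    }

    private static String key(String node, String text) {
        return node + "\t" + text;
    }

    /** Models indexing a literal when a triple is added. */
    void index(String node, String text) {
        String k = key(node, text);
        if (!allowDuplicates) {
            lookups++;                 // the duplicate check costs one lookup per add
            if (docs.contains(k)) {
                return;                // already indexed, so skip it
            }
        }
        docs.add(k);
    }

    /** Models a removal: drops one matching document, if present. */
    void unindex(String node, String text) {
        docs.remove(key(node, text));
    }

    int size() { return docs.size(); }

    int lookups() { return lookups; }

    public static void main(String[] args) {
        ToyTextIndex checked = new ToyTextIndex(false);
        checked.index("urn:ex:doc1", "hello world");
        checked.index("urn:ex:doc1", "hello world"); // skipped as a duplicate

        ToyTextIndex bulk = new ToyTextIndex(true);
        bulk.index("urn:ex:doc1", "hello world");
        bulk.index("urn:ex:doc1", "hello world");    // stored twice, no lookup

        System.out.println("checked: " + checked.size() + " docs, " + checked.lookups() + " lookups");
        System.out.println("bulk:    " + bulk.size() + " docs, " + bulk.lookups() + " lookups");
    }
}
```

In this model the checked index pays for a lookup on every add, which is the cost a bulk load would avoid by allowing duplicates.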
<p>
  The new LARQ module is distributed as a Maven artifact and can be included
  in a project, like any other dependency, using:</p>
<pre class="box">
</pre>

<h2>Enabling LARQ for RDF Datasets via an Assembler specification</h2>
<p>
  It is possible to attach an existing Lucene index built by larqbuilder to an
  RDF Dataset using the ja:textIndex property. For example, this is the
  assembler specification of a TDB Dataset with LARQ enabled:</p>

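<pre class="box">
@prefix rdf:  &lt;http://www.w3.org/1999/02/22-rdf-syntax-ns#&gt; .
@prefix rdfs: &lt;http://www.w3.org/2000/01/rdf-schema#&gt; .
@prefix ja:   &lt;http://jena.hpl.hp.com/2005/11/Assembler#&gt; .
@prefix tdb:  &lt;http://jena.hpl.hp.com/2008/tdb#&gt; .

[] ja:loadClass "com.hp.hpl.jena.tdb.TDB" .
tdb:DatasetTDB rdfs:subClassOf ja:RDFDataset .
tdb:GraphTDB rdfs:subClassOf ja:Model .

&lt;#dataset&gt; rdf:type tdb:DatasetTDB ;
  tdb:location "/path/to/tdb/indexes/" ;
  ja:textIndex "/path/to/lucene/index/" ;
  .
</pre>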
