jena-commits mailing list archives

From build...@apache.org
Subject svn commit: r972682 - in /websites/staging/jena/trunk/content: ./ documentation/query/text-query.html
Date Tue, 17 Nov 2015 10:14:33 GMT
Author: buildbot
Date: Tue Nov 17 10:14:33 2015
New Revision: 972682

Log:
Staging update by buildbot for jena

Modified:
    websites/staging/jena/trunk/content/   (props changed)
    websites/staging/jena/trunk/content/documentation/query/text-query.html

Propchange: websites/staging/jena/trunk/content/
------------------------------------------------------------------------------
--- cms:source-revision (original)
+++ cms:source-revision Tue Nov 17 10:14:33 2015
@@ -1 +1 @@
-1711735
+1714748

Modified: websites/staging/jena/trunk/content/documentation/query/text-query.html
==============================================================================
--- websites/staging/jena/trunk/content/documentation/query/text-query.html (original)
+++ websites/staging/jena/trunk/content/documentation/query/text-query.html Tue Nov 17 10:14:33 2015
@@ -423,16 +423,44 @@ Lucene index.  For example:</p>
 
 <p>will configure the index to analyze values of the 'text' field
 using a <code>StandardAnalyzer</code> with the given list of stop words.</p>
-<p>Other analyzer types that may be specified are <code>SimpleAnalyzer</code> and <code>KeywordAnalyzer</code>,
-neither of which has any configuration parameters. See the Lucene documentation
-for details of what these analyzers do. 
-In addition, Jena provides <code>LowerCaseKeywordAnalyzer</code>,
-which is a case-insensitive version of <code>KeywordAnalyzer</code>.</p>
-<p>In Jena 3.0.0:</p>
-<p>Support for the new <code>LocalizedAnalyzer</code> has been introduced to deal with Lucene
-language specific analyzers.
-See <a href="#linguistic-support-with-lucene-index">Linguistic Support with Lucene Index</a>
-part for details.</p>
+<p>Other analyzer types that may be specified are <code>SimpleAnalyzer</code> and
+<code>KeywordAnalyzer</code>, neither of which has any configuration parameters. See
+the Lucene documentation for details of what these analyzers do. Jena also
+provides <code>LowerCaseKeywordAnalyzer</code>, which is a case-insensitive version of
+<code>KeywordAnalyzer</code>, and <code>ConfigurableAnalyzer</code> (see below).</p>
+<p>Support for the new <code>LocalizedAnalyzer</code> has been introduced in Jena 3.0.0 to
+deal with Lucene language specific analyzers. See <a href="#linguistic-support-with-lucene-index">Linguistic Support with
+Lucene Index</a> part for details.</p>
+<h4 id="configurableanalyzer">ConfigurableAnalyzer<a class="headerlink" href="#configurableanalyzer" title="Permanent link">&para;</a></h4>
+<p><code>ConfigurableAnalyzer</code> was introduced in Jena 3.0.1. It allows more detailed
+configuration of text analysis parameters by independently selecting a
+<code>Tokenizer</code> and zero or more <code>TokenFilter</code>s which are applied in order after
+tokenization. See the Lucene documentation for details on what each
+tokenizer and token filter does.</p>
+<p>The available <code>Tokenizer</code> implementations are:</p>
+<ul>
+<li><code>StandardTokenizer</code></li>
+<li><code>KeywordTokenizer</code></li>
+<li><code>WhitespaceTokenizer</code></li>
+<li><code>LetterTokenizer</code></li>
+</ul>
+<p>The available <code>TokenFilter</code> implementations are:</p>
+<ul>
+<li><code>StandardFilter</code></li>
+<li><code>LowerCaseFilter</code></li>
+<li><code>ASCIIFoldingFilter</code></li>
+</ul>
+<p>Configuration is done using Jena assembler like this:</p>
+<div class="codehilite"><pre><span class="n">text</span><span class="o">:</span><span class="n">analyzer</span> <span class="o">[</span>
+  <span class="n">a</span> <span class="n">text</span><span class="o">:</span><span class="n">ConfigurableAnalyzer</span> <span class="o">;</span>
+  <span class="n">text</span><span class="o">:</span><span class="n">tokenizer</span> <span class="n">text</span><span class="o">:</span><span class="n">KeywordTokenizer</span> <span class="o">;</span>
+  <span class="n">text</span><span class="o">:</span><span class="n">filters</span> <span class="o">(</span><span class="n">text</span><span class="o">:</span><span class="n">ASCIIFoldingFilter</span><span class="o">,</span> <span class="n">text</span><span class="o">:</span><span class="n">LowerCaseFilter</span><span class="o">)</span>
+<span class="o">]</span>
+</pre></div>
+
+
+<p>Here, <code>text:tokenizer</code> must be one of the four tokenizers listed above and
+the optional <code>text:filters</code> property specifies a list of token filters.</p>
 <h4 id="analyzer-for-query">Analyzer for Query<a class="headerlink" href="#analyzer-for-query" title="Permanent link">&para;</a></h4>
 <p>New in Jena 2.13.0.</p>
 <p>There is an ability to specify an analyzer to be used for the
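
Note on the assembler example added in this revision: it separates the two filters with a comma, whereas the Turtle grammar defines a collection as whitespace-separated objects between parentheses. If a parser rejects the comma, a fragment along the following lines may work instead. This is only a sketch: it assumes the `text:` prefix is bound to the Jena text vocabulary as shown in the comment, and it omits the surrounding index definition that the fragment would be attached to.

```turtle
# Assumed prefix binding (not shown in the commit itself):
#   @prefix text: <http://jena.apache.org/text#> .
#
# Fragment only: this property/value pair belongs inside an
# index or entity-map description, which is omitted here.
text:analyzer [
  a text:ConfigurableAnalyzer ;
  # Exactly one tokenizer, chosen from the four listed in the diff.
  text:tokenizer text:KeywordTokenizer ;
  # Optional list of token filters, applied in order after tokenization.
  # Items in a Turtle collection are separated by whitespace, not commas.
  text:filters ( text:ASCIIFoldingFilter text:LowerCaseFilter )
]
```

With this combination the whole field value is kept as a single token, which is then ASCII-folded and lower-cased, in that order.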


