lucene-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From rm...@apache.org
Subject svn commit: r936724 - /lucene/dev/trunk/lucene/contrib/icu/src/java/overview.html
Date Thu, 22 Apr 2010 10:09:25 GMT
Author: rmuir
Date: Thu Apr 22 10:09:25 2010
New Revision: 936724

URL: http://svn.apache.org/viewvc?rev=936724&view=rev
Log:
add an example showing how to control back compat

Modified:
    lucene/dev/trunk/lucene/contrib/icu/src/java/overview.html

Modified: lucene/dev/trunk/lucene/contrib/icu/src/java/overview.html
URL: http://svn.apache.org/viewvc/lucene/dev/trunk/lucene/contrib/icu/src/java/overview.html?rev=936724&r1=936723&r2=936724&view=diff
==============================================================================
--- lucene/dev/trunk/lucene/contrib/icu/src/java/overview.html (original)
+++ lucene/dev/trunk/lucene/contrib/icu/src/java/overview.html Thu Apr 22 10:09:25 2010
@@ -285,5 +285,26 @@ many character foldings recursively.
    */
   TokenStream tokenstream = new ICUFoldingFilter(tokenizer);
 </pre></code>
+<hr/>
+<h1><a name="backcompat">Backwards Compatibility</a></h1>
+<p>
+This module exists to provide up-to-date Unicode functionality that supports
+the most recent version of Unicode (currently 5.2). However, some users who wish 
+for stronger backwards compatibility can restrict
+{@link org.apache.lucene.analysis.icu.ICUNormalizer2Filter} to operate on only
+a specific Unicode Version by using a {@link com.ibm.icu.text.FilteredNormalizer2}. 
+</p>
+<h2>Example Usages</h2>
+<h3>Restricting normalization to Unicode 5.0</h3>
+<code><pre>
+  /**
+   * This filter will do NFC normalization, but will ignore any characters that
+   * did not exist as of Unicode 5.0. Because of the normalization stability policy
+   * of Unicode, this is an easy way to force normalization to a specific version.
+   */
+    Normalizer2 normalizer = Normalizer2.getInstance(null, "nfc", Normalizer2.Mode.COMPOSE);
+    FilteredNormalizer2 unicode50 = new FilteredNormalizer2(normalizer, new UnicodeSet("[:age=5.0:]"));
+    TokenStream tokenstream = new ICUNormalizer2Filter(tokenizer, unicode50);
+</pre></code>
 </body>
 </html>



Mime
View raw message