lucene-solr-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Solr Wiki] Update of "UnicodeCollation" by RobertMuir
Date Thu, 03 Mar 2011 03:23:03 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.

The "UnicodeCollation" page has been changed by RobertMuir.
The comment on this change is: add an example for ICU collation.
http://wiki.apache.org/solr/UnicodeCollation?action=diff&rev1=4&rev2=5

--------------------------------------------------

  = Unicode Collation =
- <!> [[Solr1.5]]
+ <!> [[Solr3.1]]
  
  == Overview ==
  [[http://en.wikipedia.org/wiki/Unicode_collation_algorithm|Unicode Collation]] is a method
to sort text in a language-sensitive way. It is primarily intended for sorting, but can also
be used for advanced search purposes.
@@ -144, +144 @@

  
  Please note that the strange output you see from the filter is really a binary collation
key encoded in a special form. What is important is that it is the same value for equivalent
tokens as defined by that collator.
  
+ == ICU Collation ==
+ 
+ For better performance, less memory usage, and support for more locales, you can add the
analysis-extras contrib and use ICUCollationKeyFilterFactory instead. See the [[http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/contrib/analysis-extras/src/java/org/apache/solr/analysis/ICUCollationKeyFilterFactory.java|javadocs]]
for more information.
+ 
+ In general, the principles are the same; you just specify an RFC 3066 language identifier
with the locale parameter instead of separate language, country, and variant parameters.
+ 
+ For example, to get German phonebook sort order:
+ 
+ {{{
+ <fieldType name="collatedICU" class="solr.TextField">
+   <analyzer>
+     <tokenizer class="solr.KeywordTokenizerFactory"/>
+     <filter class="solr.ICUCollationKeyFilterFactory"
+         locale="de@collation=phonebook"
+         strength="primary"
+     />
+   </analyzer>
+ </fieldType>
+ }}}
+ 
+ To use this filter, see solr/contrib/analysis-extras/README.txt for instructions on which
jars you need to add to your SOLR_HOME/lib directory.
+ 
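As a rough sketch of how the collatedICU field type from the change above might be used for sorting, the fragment below declares a sort field of that type and copies another field into it. This is an illustration and not part of the wiki change: the field names "name" and "name_sort" are hypothetical, and it assumes a "name" field is already defined elsewhere in schema.xml.

```xml
<!-- Hypothetical sort field using the collatedICU type; indexed only,
     since the collation key is not meant for display. -->
<field name="name_sort" type="collatedICU" indexed="true" stored="false"/>

<!-- Copy the (assumed, pre-existing) "name" field into the sort field
     so the collation key is built at index time. -->
<copyField source="name" dest="name_sort"/>
```

With a setup along these lines, a query could request sort=name_sort asc to get results in the collator's order, while "name" stays available for display.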
