lucene-solr-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Solr Wiki] Update of "SchemaDesign" by BrookeSchreierGanz
Date Mon, 21 May 2012 19:13:44 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.

The "SchemaDesign" page has been changed by BrookeSchreierGanz:
http://wiki.apache.org/solr/SchemaDesign?action=diff&rev1=12&rev2=13

Comment:
added information about BeiderMorseFilterFactory (Beider-Morse Phonetic Matching), which was
added in Solr 3.6

  == Phonemes ==
  Programmers are perfect spellers and expect the same of their users. A ''phoneme'' represents
(roughly) the sound of one syllable. Phoneme-based searching can give users a better search
experience. To support misspelled search words, phoneme filters cause the index to store phoneme-based
representations of the text instead of the input. This only finds misspellings which sound
like the original word.
  
- To create a phoneme-based field, you need a text filter stack that does not include stemming
or stopwords, and add the  solr.!PhoneticFilterFactory (see AnalyzersTokenizersTokenFilters)
with one of the available encoders. This must be in both the indexing and query stack. Of
the several available the "Double Metaphone" filter is the most popular and does well with
non-English text. There are as yet no language-specific phoneme encoders.
+ To create a phoneme-based field, you need a text filter stack that does not include stemming
or stopwords. You can then add the solr.!PhoneticFilterFactory (see AnalyzersTokenizersTokenFilters)
with one of the available encoders. This must be in both the indexing and query stack. Of
the several available encoders, the "Double Metaphone" filter is the most popular and does well with
non-English text.
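
As a rough sketch of the setup described above, a phoneme-based field type in schema.xml might look like the following. The field type name `text_phonetic`, the choice of tokenizer and lowercase filter, and the `inject="false"` setting are assumptions made for illustration and do not come from the wiki text. The leading "!" in the wiki markup only suppresses wiki linking; the class name in the config is written without it.

```xml
<!-- Hypothetical field type: no stemming or stopword filters in the stack. -->
<!-- A single <analyzer> with no type attribute applies to both indexing and querying. -->
<fieldType name="text_phonetic" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- inject="false" replaces each token with its phonetic code; -->
    <!-- inject="true" would index the code alongside the original token. -->
    <filter class="solr.PhoneticFilterFactory" encoder="DoubleMetaphone" inject="false"/>
  </analyzer>
</fieldType>
```

Using one shared analyzer is the simplest way to satisfy the requirement that the filter be present in both the indexing and query stack.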
+ 
+ Newly added in Solr 3.6 was the solr.!BeiderMorseFilterFactory (see AnalyzersTokenizersTokenFilters)
which is optimized for finding surnames that sound alike but may be spelled differently, especially
Central European and Eastern European surnames.  For example, a search for the surname "Maddow"
can also turn up the name "Madoff" and vice versa.
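
A field type using the Beider-Morse filter could be sketched along these lines. The field type name `text_surname_bm` and the tokenizer are assumptions made for illustration; the attribute values shown are one plausible configuration, not a recommendation from the wiki page.

```xml
<!-- Hypothetical field type for surname matching (requires Solr 3.6 or later). -->
<fieldType name="text_surname_bm" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- nameType="GENERIC" and ruleType="APPROX" select general-purpose, approximate matching; -->
    <!-- languageSet="auto" lets the filter guess the language of each name. -->
    <filter class="solr.BeiderMorseFilterFactory"
            nameType="GENERIC" ruleType="APPROX"
            concat="true" languageSet="auto"/>
  </analyzer>
</fieldType>
```

As with the phonetic filter, the same analyzer should be applied at both index and query time so that encoded forms are compared with encoded forms.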
  
  For another take on assisting spelling, see SpellCheckComponent.
  == Unicode processing ==
