lucene-solr-user mailing list archives

From Lance Norskog <>
Subject Re: Search both diacritics and non-diacritics
Date Sun, 03 Jan 2010 00:31:01 GMT
The ASCIIFoldingFilter is a superset of the ISOLatin1AccentFilter,
which is deprecated. You did not mention which language you want to
search.

Unfortunately, the ASCIIFoldingFilter is not mentioned on the Solr
wiki. Here's the Javadoc from ASCIIFoldingFilter:

This class converts alphabetic, numeric, and symbolic Unicode
characters which are not in the first 127 ASCII characters (the "Basic
Latin" Unicode block) into their ASCII equivalents, if one exists.
Characters from the following Unicode blocks are converted; however,
only those characters with reasonable ASCII alternatives are converted:

C1 Controls and Latin-1 Supplement
Latin Extended-A
Latin Extended-B
Latin Extended Additional
Latin Extended-C
Latin Extended-D
IPA Extensions
Phonetic Extensions
Phonetic Extensions Supplement
General Punctuation
Superscripts and Subscripts
Enclosed Alphanumerics
Supplemental Punctuation
Alphabetic Presentation Forms
Halfwidth and Fullwidth Forms

The set of character conversions supported by this class is a superset
of those supported by Lucene's ISOLatin1AccentFilter, which strips
accents from Latin1 characters. For example, 'à' will be replaced by 'a'.
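
To illustrate why folding lets a search match both accented and unaccented spellings, here is a rough sketch in Python. It is not the Lucene filter: it only approximates accent stripping with the standard library's Unicode decomposition, and the names fold_to_ascii and matches are made up for this example. The real ASCIIFoldingFilter uses its own mapping table and also converts characters that have no decomposition (for example 'ß' to "ss"), which this sketch leaves untouched. The idea is the same, though: apply the same folding to the indexed text and to the query, then compare.

```python
import unicodedata


def fold_to_ascii(text: str) -> str:
    """Approximate accent folding (hypothetical helper, not Lucene's table).

    Decomposes each character (NFKD) and drops the combining marks, so
    'à' becomes 'a'. Characters without a decomposition, such as 'ø' or
    'ß', pass through unchanged.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def matches(query: str, indexed_term: str) -> bool:
    """Compare a query and an indexed term after folding both sides."""
    return fold_to_ascii(query).lower() == fold_to_ascii(indexed_term).lower()


if __name__ == "__main__":
    print(fold_to_ascii("à"))        # a
    print(matches("cafe", "café"))   # True
    print(matches("café", "cafe"))   # True
```

In Solr the equivalent effect comes from putting the folding filter in both the index-time and query-time analyzer chains of the field type, so that both sides are reduced to the same form.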
