lucene-dev mailing list archives

From Robert Muir
Subject Re: Should ASCIIFoldingFilter be deprecated?
Date Tue, 08 Feb 2011 14:50:49 GMT
On Tue, Feb 8, 2011 at 9:12 AM, David Smiley wrote:

> I'm skeptical that whatever the difference is is relevant in the scheme of
> things. The cost to keeping it is introducing confusion on users, and more
> code to maintain.

It's pretty significant. CharFilters are not reusable, and they box every
character and look it up in a HashMap (I made a patch to fix the
reusability, but no one has commented).

ASCIIFoldingFilter does a huge switch (which still isn't optimal), but
it's way, way faster than MappingCharFilter, especially since it's a
no-op for chars < 0x7F.
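
A minimal sketch of the contrast described above, not the Lucene source: one method looks each char up in a HashMap (which boxes it to a Character first), the other returns early for ASCII and falls into a switch otherwise. The class name, method names, the three-entry mapping table, and the exact ASCII cutoff are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; names and the tiny mapping table are invented.
class FoldingSketch {
    // Map-based lookup: each char must be boxed to a Character to be used as a key.
    static final Map<Character, String> MAP = new HashMap<>();
    static {
        MAP.put('\u00E9', "e");  // e with acute
        MAP.put('\u00FC', "u");  // u with diaeresis
        MAP.put('\u00DF', "ss"); // sharp s
    }

    static String foldWithMap(String in) {
        StringBuilder out = new StringBuilder(in.length());
        for (int i = 0; i < in.length(); i++) {
            char c = in.charAt(i);
            String repl = MAP.get(c); // autoboxes c, then hashes it, for every char
            if (repl != null) {
                out.append(repl);
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    static String foldWithSwitch(String in) {
        StringBuilder out = new StringBuilder(in.length());
        for (int i = 0; i < in.length(); i++) {
            char c = in.charAt(i);
            if (c < 0x80) {          // assumed cutoff: 7-bit ASCII passes through untouched
                out.append(c);
                continue;
            }
            switch (c) {             // primitive switch: no boxing, no hashing
                case '\u00E9': out.append('e'); break;
                case '\u00FC': out.append('u'); break;
                case '\u00DF': out.append("ss"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String text = "caf\u00E9 stra\u00DFe";
        System.out.println(foldWithMap(text));
        System.out.println(foldWithSwitch(text));
    }
}
```

Both methods produce the same output; the difference is only in the per-character work, which is what matters for mostly-ASCII text.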

ICUFoldingFilter precompiles a recursively decomposed trie, so its
lookup goes through a Unicode folded trie. I think it's a tad
slower than ASCIIFoldingFilter, but it also incorporates case folding
and Unicode normalization: neither ASCIIFoldingFilter nor
MappingCharFilter will properly fold Z followed by a combining cedilla,
because there is no composed form for that sequence, but
ICUFoldingFilter will.
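
The point about Z + combining cedilla can be checked with the JDK alone. The sketch below is a stand-in, not ICUFoldingFilter: it uses java.text.Normalizer to show that NFC composes C + cedilla into one code point but leaves Z + cedilla as two, so a table keyed on single precomposed characters has nothing to match. The class name and the stripMarks helper are hypothetical; decomposing and dropping combining marks is a swapped-in technique used here only to illustrate the idea.

```java
import java.text.Normalizer;

// Hypothetical illustration only; this is not how ICUFoldingFilter is implemented.
class CedillaSketch {
    // Decompose to NFD, then drop nonspacing combining marks (general category Mn).
    static String stripMarks(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD).replaceAll("\\p{Mn}+", "");
    }

    public static void main(String[] args) {
        String cCedilla = "C\u0327"; // C + COMBINING CEDILLA
        String zCedilla = "Z\u0327"; // Z + COMBINING CEDILLA

        // NFC composes C + cedilla into the single code point U+00C7 ...
        System.out.println(Normalizer.normalize(cCedilla, Normalizer.Form.NFC).length()); // prints 1
        // ... but Z + cedilla has no precomposed form, so it stays as two code points.
        System.out.println(Normalizer.normalize(zCedilla, Normalizer.Form.NFC).length()); // prints 2

        // Working on the decomposed form handles both cases the same way.
        System.out.println(stripMarks(cCedilla)); // prints C
        System.out.println(stripMarks(zCedilla)); // prints Z
    }
}
```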
