lucene-dev mailing list archives

From "Uwe Schindler (JIRA)" <>
Subject [jira] Commented: (LUCENE-2015) ASCIIFoldingFilter: expose folding logic + small improvements to ISOLatin1AccentFilter
Date Thu, 29 Oct 2009 19:13:59 GMT


Uwe Schindler commented on LUCENE-2015:

We cannot apply the patch to ISOLatin1AccentFilter, as changing its output would break indexes
already built with it. That is exactly why we migrated to ASCIIFoldingFilter and kept
ISOLatin1AccentFilter alive. So we should leave it as it is.

On the buffer problem: for easy external use we could provide an expert API that works
like the current public foldToASCII method, which is memory efficient. We could also provide
String/StringBuilder converters for external use. Internally it cannot get better than it
currently is :-)
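
To make the two API flavours concrete, here is a minimal sketch of how they could sit side by side. It is not Lucene's code: the class FoldingSketch, its method names and signatures, and the deliberately tiny mapping table are all hypothetical and only illustrate the idea of a buffer-based expert method plus String/StringBuilder convenience wrappers built on top of it.

```java
// Hypothetical sketch, NOT Lucene's ASCIIFoldingFilter. The mapping table is
// intentionally tiny; a real folding table covers far more characters.
final class FoldingSketch {
    private FoldingSketch() {}

    // Worst-case growth per input char in this sketch (sharp s becomes "ss").
    static final int MAX_EXPANSION = 2;

    // Expert-style entry point: folds `length` chars starting at in[inPos] into
    // out starting at out[outPos] and returns the new end position in out.
    // The caller owns both buffers, so no allocation happens here.
    static int fold(char[] in, int inPos, char[] out, int outPos, int length) {
        final int end = inPos + length;
        for (int i = inPos; i < end; i++) {
            final char c = in[i];
            if (c < '\u0080') {          // plain ASCII passes through untouched
                out[outPos++] = c;
                continue;
            }
            switch (c) {
                case '\u00E0': case '\u00E1': case '\u00E2':   // a with grave/acute/circumflex
                    out[outPos++] = 'a'; break;
                case '\u00E8': case '\u00E9': case '\u00EA':   // e with grave/acute/circumflex
                    out[outPos++] = 'e'; break;
                case '\u00FC':                                 // u with diaeresis
                    out[outPos++] = 'u'; break;
                case '\u00DF':                                 // sharp s expands to two chars
                    out[outPos++] = 's'; out[outPos++] = 's'; break;
                case '\u2018': case '\u2019':                  // left/right single quotation marks
                    out[outPos++] = '\''; break;
                case '\u2013': case '\u2014':                  // en dash, em dash
                    out[outPos++] = '-'; break;
                default:                                       // unmapped chars are kept as-is
                    out[outPos++] = c;
            }
        }
        return outPos;
    }

    // Convenience wrapper: appends the folded form of s to dest and returns dest.
    static StringBuilder foldInto(CharSequence s, StringBuilder dest) {
        char[] in = s.toString().toCharArray();
        char[] out = new char[in.length * MAX_EXPANSION];
        int n = fold(in, 0, out, 0, in.length);
        return dest.append(out, 0, n);
    }

    // Convenience wrapper: returns a new folded String.
    static String fold(String s) {
        return foldInto(s, new StringBuilder(s.length())).toString();
    }
}
```

With something like this, external callers would write FoldingSketch.fold(text) and pay for a temporary buffer, while the filter itself would keep calling the char[] variant on its reusable term buffer.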

> ASCIIFoldingFilter: expose folding logic + small improvements to ISOLatin1AccentFilter
> --------------------------------------------------------------------------------------
>                 Key: LUCENE-2015
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Analysis
>            Reporter: Cédrik LIME
>            Priority: Minor
>         Attachments: ASCIIFoldingFilter-no_formatting.patch, Filters.patch, ISOLatin1AccentFilter.patch
> This patch adds a couple of non-ASCII chars to ISOLatin1AccentFilter (namely: left &
> right single quotation marks, en dash, em dash) which we very frequently encounter in our
> projects. I know that this class is now deprecated; this improvement is for legacy code that
> hasn't migrated yet.
> It also enables easy access to the ASCII folding technique used in ASCIIFoldingFilter
> for potential re-use in non-Lucene-related code.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
