lucene-dev mailing list archives

From "Robert Muir (JIRA)" <>
Subject [jira] Updated: (LUCENE-2068) fix reverseStringFilter for unicode 4.0
Date Thu, 19 Nov 2009 02:27:39 GMT


Robert Muir updated LUCENE-2068:

    Attachment: LUCENE-2068.patch

This patch adds backwards compatibility for the buggy behavior, selected by version.
It is gross because many public static methods were exposed, but Solr, for example, is
using them.

Simon, are you applying patches with Eclipse?
If so, it will not work: you need to open the patch in an editor, select all, copy, and then
apply from the clipboard.
In your patch the test is corrupted; the characters should be Chinese. I think this is
why you were confused about the tests before.

> fix reverseStringFilter for unicode 4.0
> ---------------------------------------
>                 Key: LUCENE-2068
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: contrib/analyzers
>            Reporter: Robert Muir
>            Assignee: Simon Willnauer
>            Priority: Minor
>             Fix For: 3.1
>         Attachments: LUCENE-2068.patch, LUCENE-2068.patch, LUCENE_2068.patch, LUCENE_2068.patch
> ReverseStringFilter is not aware of supplementary characters: when it reverses, it will
> create unpaired surrogates, which will be replaced by U+FFFD by the indexer (but not at query
> time).
> The wrong words will conflate to each other, and the right words won't match; basically
> the whole thing falls apart.
> This patch implements in-place reverse with the algorithm from Apache Harmony's AbstractStringBuilder.reverse0().
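
To make the problem in the quoted issue concrete, here is a minimal sketch of a surrogate-aware in-place reverse. It is not the code from the attached patch and not the Harmony reverse0() algorithm; it is a simpler two-pass variant (reverse every char, then swap back any low/high surrogate pair that the first pass left in the wrong order). The class name SurrogateAwareReverse and its method signatures are hypothetical, chosen only for illustration.

```java
// Hypothetical helper, for illustration only; not part of Lucene or the patch.
final class SurrogateAwareReverse {
    private SurrogateAwareReverse() {}

    /** Reverses buf[start, start + len) in place, keeping surrogate pairs intact. */
    static void reverse(char[] buf, int start, int len) {
        // Pass 1: naive char-by-char reversal. On its own this turns every
        // (high, low) surrogate pair into an invalid (low, high) sequence.
        for (int i = start, j = start + len - 1; i < j; i++, j--) {
            char tmp = buf[i];
            buf[i] = buf[j];
            buf[j] = tmp;
        }
        // Pass 2: wherever a low surrogate is directly followed by a high
        // surrogate, swap the two so they form a valid pair again.
        int end = start + len;
        for (int i = start; i < end - 1; i++) {
            if (Character.isLowSurrogate(buf[i]) && Character.isHighSurrogate(buf[i + 1])) {
                char tmp = buf[i];
                buf[i] = buf[i + 1];
                buf[i + 1] = tmp;
                i++; // skip the second half of the repaired pair
            }
        }
    }

    /** Convenience wrapper for whole strings. */
    static String reverse(String s) {
        char[] buf = s.toCharArray();
        reverse(buf, 0, buf.length);
        return new String(buf);
    }
}
```

With this sketch, reversing the string "a\uD840\uDC00b" (an 'a', the supplementary character U+20000, and a 'b') keeps the surrogate pair in high/low order between the swapped 'b' and 'a', whereas a plain char-by-char reverse would leave the two surrogates in the invalid low/high order.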

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
