lucene-dev mailing list archives

From "Robert Muir (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-2094) Prepare CharArraySet for Unicode 4.0
Date Tue, 24 Nov 2009 16:25:39 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-2094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12782009#action_12782009 ]

Robert Muir commented on LUCENE-2094:
-------------------------------------

Simon, yeah. It's tricky, you know, like many supplementary character issues.

Even if we provide perfect backwards compatibility with what 3.0 did, if you care about these
languages, you *WANT* to reindex, because stuff wasn't working at all before.
And if you really care, you weren't using any of Lucene's analysis components anyway (except
maybe WhitespaceTokenizer).
For example, StandardAnalyzer currently discards these characters.

But we don't want to screw over CJK users, where things might have been "mostly" working before,
either.
In this case, CJK is completely unaffected, so I think we should not use version here or in any
other lowercasing fixes, including LowerCaseFilter itself.


> Prepare CharArraySet for Unicode 4.0
> ------------------------------------
>
>                 Key: LUCENE-2094
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2094
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Analysis
>    Affects Versions: 1.9, 2.0.0, 2.1, 2.2, 2.3, 2.3.1, 2.3.2, 2.3.3, 2.4, 2.4.1, 2.4.2, 2.9, 2.9.1, 2.9.2, 3.0, 3.0.1, 3.1
>            Reporter: Simon Willnauer
>             Fix For: 3.1
>
>         Attachments: LUCENE-2094.txt, LUCENE-2094.txt, LUCENE-2094.txt
>
>
> CharArraySet lowercases its entries if created with the corresponding flag. As a result,
> a String / char[] containing Unicode 4.0 characters that is in the set cannot be retrieved
> in "ignorecase" mode.
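
To illustrate the problem described above, here is a minimal sketch in plain Java. It is not Lucene's code: the class and method names are hypothetical, and the example assumes the failure comes from lowercasing one UTF-16 char at a time. It uses the Deseret letters U+10400 (uppercase) and U+10428 (its lowercase form), which lie outside the BMP and are therefore stored as surrogate pairs.

```java
public class SupplementaryLowercase {

    // Hypothetical helper: lowercases one UTF-16 code unit at a time.
    // Surrogate halves have no case mapping, so supplementary
    // characters pass through unchanged.
    static String lowerPerChar(String s) {
        char[] buf = s.toCharArray();
        for (int i = 0; i < buf.length; i++) {
            buf[i] = Character.toLowerCase(buf[i]);
        }
        return new String(buf);
    }

    // Hypothetical helper: lowercases whole code points, so a surrogate
    // pair is decoded, mapped, and re-encoded as one unit.
    static String lowerPerCodePoint(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            sb.appendCodePoint(Character.toLowerCase(cp));
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String upper = new String(Character.toChars(0x10400));
        String lower = new String(Character.toChars(0x10428));

        // Per-char lowercasing leaves the uppercase letter untouched.
        System.out.println(lowerPerChar(upper).equals(lower));      // false
        // Code-point-aware lowercasing produces the expected form.
        System.out.println(lowerPerCodePoint(upper).equals(lower)); // true
    }
}
```

A case-insensitive set that normalizes with the per-char variant would store and look up such entries without ever folding their case, which matches the symptom reported in the issue.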

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

