lucene-dev mailing list archives

From "DM Smith (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-1793) remove custom encoding support in Greek/Russian Analyzers
Date Sun, 09 Aug 2009 17:40:14 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-1793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12741109#action_12741109 ]

DM Smith commented on LUCENE-1793:
----------------------------------

bq. If this is the concern, then I think a better solution would be to integrate some form
of Unicode compression (e.g. BOCU-1) into Lucene, rather than try to deal with legacy character
sets in this way.

So it doesn't get lost, would it be good to open an issue for this? And for alternate encodings?

> remove custom encoding support in Greek/Russian Analyzers
> ---------------------------------------------------------
>
>                 Key: LUCENE-1793
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1793
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: contrib/analyzers
>            Reporter: Robert Muir
>            Priority: Minor
>         Attachments: LUCENE-1793.patch
>
>
> The Greek and Russian analyzers support custom encodings such as KOI-8; they define things
> like lowercasing and tokenization for these.
> I think that analyzers should support Unicode and that conversion/handling of other charsets
> belongs somewhere else.
> I would like to deprecate/remove the support for these other encodings.
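
A minimal sketch of the separation described above, assuming the legacy-encoded bytes are decoded to Unicode at the I/O boundary so that the analyzer only ever sees Unicode text. This is not taken from LUCENE-1793.patch; the class and method names (LegacyCharsetDecoding, openAsUnicode, decode) are hypothetical, and only standard JDK calls are used. It also assumes the JDK in use ships the KOI8-R charset; Charset.forName throws UnsupportedCharsetException if it does not.

```java
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.util.Locale;

public class LegacyCharsetDecoding {

    // Wrap a legacy-encoded byte stream in a Reader that yields Unicode chars.
    // A Reader like this can be handed to any Unicode-only analyzer.
    static Reader openAsUnicode(InputStream in, String charsetName) {
        return new InputStreamReader(in, Charset.forName(charsetName));
    }

    // Decode a whole byte array at once; convenient for short fields.
    static String decode(byte[] raw, String charsetName) {
        return new String(raw, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // Three uppercase Cyrillic letters (U+041C U+0418 U+0420) encoded in KOI8-R.
        byte[] koi8 = {(byte) 0xED, (byte) 0xE9, (byte) 0xF2};

        // Charset handling happens here, outside the analyzer.
        String text = decode(koi8, "KOI8-R");

        // Once the text is Unicode, lowercasing needs no per-encoding tables.
        String lowered = text.toLowerCase(Locale.ROOT);
        System.out.println(lowered);
    }
}
```

With this split, supporting another legacy charset means changing only the charset name passed at the boundary, not the lowercasing or tokenization code.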

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



