From "Robert Muir (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-3747) Support Unicode 6.1.0
Date Tue, 17 Jul 2012 14:09:35 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-3747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13416227#comment-13416227 ]

Robert Muir commented on LUCENE-3747:
-------------------------------------

Basically Steve, my opinion is if we have a good way to script this thing, we should just
try to come up with some appropriate Sets for this stuff and automate it. It doesn't need
to be perfect.

And then go forward from there with fine tuning the script... but I think automation should
be the priority!
                
> Support Unicode 6.1.0
> ---------------------
>
>                 Key: LUCENE-3747
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3747
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: modules/analysis
>    Affects Versions: 3.5, 4.0-ALPHA
>            Reporter: Steven Rowe
>            Priority: Minor
>         Attachments: LUCENE-3747.patch
>
>
> Now that Unicode 6.1.0 has been released, Lucene/Solr should support it.
> JFlex trunk now supports Unicode 6.1.0.
> Tasks include:
> * Upgrade ICU4J to v49 (after it's released, on 2012-03-21, according to http://icu-project.org).
> * Use {{icu}} module tools to regenerate the supplementary character additions to JFlex grammars.
> * Version the JFlex grammars: copy the current implementations to {{*Impl3<X>}}; cause the versioning tokenizer wrappers to instantiate this version when the {{Version}} c-tor param is in the range 3.1 to the version in which these changes are released (excluding the range endpoints); then change the specified Unicode version in the non-versioned JFlex grammars from 6.0 to 6.1.
> * Regenerate JFlex scanners, including {{StandardTokenizerImpl}}, {{UAX29URLEmailTokenizerImpl}}, and {{HTMLStripCharFilter}}.
> * Using {{generateJavaUnicodeWordBreakTest.pl}}, generate and then run {{WordBreakTestUnicode_6_1_0.java}} under {{modules/analysis/common/src/test/org/apache/lucene/analysis/core/}}
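
The grammar-versioning task above could be sketched roughly as follows. This is only an illustration, not Lucene code: the `Version` and `Grammar` enums, the `select` method, and the choice of 4.0 as the release version are all hypothetical stand-ins. It reads "excluding the range endpoints" literally, so 3.1 itself is left to whatever the wrappers already do.

```java
public class GrammarVersionSketch {
    /** Hypothetical stand-in for Lucene's Version constants, declared in release order. */
    enum Version { V3_0, V3_1, V3_5, V3_6, V4_0 }

    /** Hypothetical labels for which generated scanner a wrapper would instantiate. */
    enum Grammar { EXISTING_WRAPPER_LOGIC, FROZEN_UNICODE_6_0, CURRENT_UNICODE_6_1 }

    /**
     * Picks a grammar for the given match version.
     * releasedIn is the (assumed) version in which the Unicode 6.1 grammars ship.
     */
    static Grammar select(Version matchVersion, Version releasedIn) {
        int v = matchVersion.ordinal();
        int lower = Version.V3_1.ordinal();
        int upper = releasedIn.ordinal();
        if (v >= upper) {
            // At or after the release: use the non-versioned (Unicode 6.1) grammar.
            return Grammar.CURRENT_UNICODE_6_1;
        }
        if (v > lower) {
            // Strictly between 3.1 and the release: use the frozen Unicode 6.0 copy.
            return Grammar.FROZEN_UNICODE_6_0;
        }
        // 3.1 and earlier: outside the range described, so unchanged here.
        return Grammar.EXISTING_WRAPPER_LOGIC;
    }

    public static void main(String[] args) {
        for (Version v : Version.values()) {
            System.out.println(v + " -> " + select(v, Version.V4_0));
        }
    }
}
```

Keeping the selection in one place like this would make it easy to adjust the endpoints once the actual release version is known.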


