lucene-dev mailing list archives

From "ASF subversion and git services (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-5357) Upgrade StandardTokenizer & co to latest unicode rules
Date Sat, 07 Dec 2013 00:06:35 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841918#comment-13841918 ]

ASF subversion and git services commented on LUCENE-5357:
---------------------------------------------------------

Commit 1548762 from [~steve_rowe] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1548762 ]

LUCENE-5357: Upgrade StandardTokenizer and UAX29URLEmailTokenizer to Unicode 6.3; update
UAX29URLEmailTokenizer's recognized top level domains in URLs and Emails from the IANA Root
Zone Database; add std40/StandardTokenizerImpl40 and std40/UAX29URLEmailTokenizerImpl40, for
backcompat from 4.0->4.6.  (merged trunk r1548595 and r1548746)

> Upgrade StandardTokenizer & co to latest unicode rules
> ------------------------------------------------------
>
>                 Key: LUCENE-5357
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5357
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: modules/analysis
>            Reporter: Robert Muir
>         Attachments: LUCENE-5357.patch
>
>
> Besides any change in data, the rules have also changed (regional indicators, better
> handling for Hebrew, etc.)
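
The "regional indicators" item in the description above presumably refers to the UAX #29 rule that keeps adjacent Regional Indicator symbols (U+1F1E6..U+1F1FF, the code points that pair up to form flags) together instead of breaking between them. A minimal sketch of that idea using only the JDK follows. The class and method names are hypothetical, and this is a toy illustration under that assumption, not the JFlex-generated grammar that StandardTokenizer actually uses:

```java
import java.util.ArrayList;
import java.util.List;

public class RegionalIndicatorRuns {
    // Regional Indicator Symbols occupy U+1F1E6..U+1F1FF.
    static boolean isRegionalIndicator(int cp) {
        return cp >= 0x1F1E6 && cp <= 0x1F1FF;
    }

    // Toy segmentation (hypothetical helper, not Lucene code): adjacent
    // regional indicators stay in one segment; every other code point
    // becomes its own segment.
    static List<String> segment(String text) {
        List<String> out = new ArrayList<>();
        StringBuilder run = new StringBuilder();
        int i = 0;
        while (i < text.length()) {
            int cp = text.codePointAt(i);
            if (isRegionalIndicator(cp)) {
                run.appendCodePoint(cp);
            } else {
                if (run.length() > 0) {
                    out.add(run.toString());
                    run.setLength(0);
                }
                out.add(new String(Character.toChars(cp)));
            }
            i += Character.charCount(cp); // advance past surrogate pairs correctly
        }
        if (run.length() > 0) {
            out.add(run.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        // U+1F1FA U+1F1F8 are the regional indicators for "U" and "S".
        String us = new String(Character.toChars(0x1F1FA))
                  + new String(Character.toChars(0x1F1F8));
        System.out.println(segment("a" + us + "b").size()); // prints 3
    }
}
```

The two indicator code points end up in a single segment between "a" and "b", which is the "do not break between regional indicators" behavior in miniature.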



--
This message was sent by Atlassian JIRA
(v6.1#6144)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org
