lucene-dev mailing list archives

From "Hoss Man (JIRA)" <>
Subject [jira] Updated: (SOLR-1571) unicode collation support
Date Thu, 27 May 2010 23:15:40 GMT


Hoss Man updated SOLR-1571:

    Fix Version/s: 3.1

Correcting Fix Version based on CHANGES.txt, see this thread for more details...

> unicode collation support
> -------------------------
>                 Key: SOLR-1571
>                 URL:
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Robert Muir
>            Assignee: Shalin Shekhar Mangar
>            Priority: Minor
>             Fix For: 1.5, 3.1, 4.0
>         Attachments: SOLR-1571.patch
> This patch adds support for Unicode collation (searching and sorting).
> Unicode collation is helpful in a search engine: for many languages you want things to match or sort differently.
> You might even want to use copyField and support different sort orders/matching schemes if you need to support multiple languages.
> This is simply a factory for Lucene's CollationKeyFilter, which indexes binary collation keys in a special format that preserves binary sort order.
> I've added support for creating a Collator in two ways:
> * system collator from a Locale spec (language + country + variant)
> * tailored collator from custom rules in a text file
> There is no option to use the "default" locale of the JVM (I consider this a bit dangerous).
> In this patch, it is mandatory to define the locale explicitly for a system collator.
> The required lucene-collation-2.9.1.jar is only 12KB.
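
For readers unfamiliar with the two collator-creation paths described in the issue, here is a rough sketch using only the JDK's java.text classes. It is not taken from the patch: the class name CollationSketch and its helper methods are hypothetical, and the rule string is passed inline, whereas the patch reads custom rules from a text file. The sketch only illustrates an explicit-locale system collator, a rule-based tailored collator, and how byte-wise comparison of collation keys follows the collator's ordering.

```java
import java.text.Collator;
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.Locale;

public class CollationSketch {

    // System collator: the locale is always given explicitly and is
    // never taken from Locale.getDefault().
    static Collator systemCollator(String language, String country, String variant) {
        return Collator.getInstance(new Locale(language, country, variant));
    }

    // Tailored collator: built from a custom rule string (assumed here to
    // have already been loaded, e.g. from a text file).
    static Collator tailoredCollator(String rules) {
        try {
            return new RuleBasedCollator(rules);
        } catch (ParseException e) {
            throw new IllegalArgumentException("invalid collation rules", e);
        }
    }

    // Binary collation key for a string under the given collator.
    static byte[] key(Collator collator, String text) {
        return collator.getCollationKey(text).toByteArray();
    }

    // Unsigned, byte-by-byte comparison of two keys.
    static int compareKeys(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        Collator english = systemCollator("en", "US", "");
        english.setStrength(Collator.PRIMARY); // ignore case and accent differences
        System.out.println(english.compare("apple", "Apple") == 0);

        // Hypothetical tailoring that reverses the order of a, b and c.
        Collator reversed = tailoredCollator("< c < b < a");
        System.out.println(reversed.compare("c", "a") < 0);
        System.out.println(compareKeys(key(reversed, "c"), key(reversed, "a")) < 0);
    }
}
```

Passing the locale parts as arguments mirrors the requirement above that a system collator must name its locale explicitly, so results do not depend on the machine the JVM runs on.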

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
