lucene-dev mailing list archives

From "Steven Rowe (JIRA)" <j...@apache.org>
Subject [jira] Updated: (LUCENE-1279) RangeQuery and RangeFilter should use collation to check for range inclusion
Date Mon, 05 May 2008 04:40:55 GMT

     [ https://issues.apache.org/jira/browse/LUCENE-1279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steven Rowe updated LUCENE-1279:
--------------------------------

    Attachment: LUCENE-1279.patch

Attaching a patch containing class CollatingRangeQuery, which extends RangeQuery, overriding
the rewrite() method.  A test class is also supplied.  This is targeted at contrib/.

Index term ordering (Unicode code point order) is not necessarily the same as the collation
order of the given Locale, so *all* index terms in the Field of the lower and upper range
terms have to be examined.  As a result, CollatingRangeQuery will be significantly slower
than RangeQuery.
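
To illustrate why every term has to be checked, here is a minimal sketch using only java.text.Collator.  It is not the code from the patch: the class CollatedRangeSketch, its method names, and the sample terms are assumptions made up for this example, and a plain List stands in for the index terms of a Field.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Illustrative only; hypothetical class, not the CollatingRangeQuery from the patch. */
class CollatedRangeSketch {

    /**
     * Inclusive range check with String.compareTo, which compares UTF-16 code units.
     * For the BMP characters used here that matches Unicode code point order.
     */
    static List<String> codePointRange(List<String> terms, String lower, String upper) {
        List<String> hits = new ArrayList<>();
        for (String term : terms) {
            if (term.compareTo(lower) >= 0 && term.compareTo(upper) <= 0) {
                hits.add(term);
            }
        }
        return hits;
    }

    /**
     * Inclusive range check with a Collator. Every term is tested, because a term
     * that sorts outside the range by code point may still fall inside it by collation.
     */
    static List<String> collatedRange(List<String> terms, String lower, String upper,
                                      Collator collator) {
        List<String> hits = new ArrayList<>();
        for (String term : terms) {
            if (collator.compare(term, lower) >= 0 && collator.compare(term, upper) <= 0) {
                hits.add(term);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // "\u00e9clair" starts with e-acute (U+00E9), which sorts after 'z' by code point.
        List<String> terms = Arrays.asList("apple", "\u00e9clair", "zebra");
        Collator french = Collator.getInstance(Locale.FRENCH);
        System.out.println(codePointRange(terms, "a", "z"));
        System.out.println(collatedRange(terms, "a", "z", french));
    }
}
```

With the range a..z, the code point comparison leaves out the term starting with U+00E9, while the French collator places it between "e" and "f" and so includes it.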

One of the tests uses some of the Farsi information Esra supplied in the original post.  Note
that neither Java 1.4.2 nor 1.5.0 contains collation information for Farsi.  Instead, the
test uses the Arabic Locale, which appears to contain the proper letter ordering for the non-Arabic
Farsi letters.

> RangeQuery and RangeFilter should use collation to check for range inclusion
> ----------------------------------------------------------------------------
>
>                 Key: LUCENE-1279
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1279
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>    Affects Versions: 2.3.1
>            Reporter: Steven Rowe
>            Priority: Minor
>             Fix For: 2.4
>
>         Attachments: LUCENE-1279.patch
>
>
> See [this java-user discussion|http://www.nabble.com/lucene-farsi-problem-td16977096.html]
> of problems caused by Unicode code-point comparison, instead of collation, in RangeQuery.
> RangeQuery could take in a Locale via a setter, which could be used with a java.text.Collator
> and/or CollationKey's, to handle ranges for languages which have alphabet orderings different
> from those in Unicode.
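
The setter idea in the description above could look roughly like the following sketch.  LocaleAwareRange and its methods are hypothetical names invented for this example, not part of Lucene or of the attached patch.  The sketch computes CollationKeys for the two range bounds once, when the Locale is set, and compares each candidate term's key against them.

```java
import java.text.CollationKey;
import java.text.Collator;
import java.util.Locale;

/** Hypothetical helper for illustration; not a Lucene class. */
class LocaleAwareRange {
    private final String lower;
    private final String upper;
    private Collator collator;      // null until a Locale is set
    private CollationKey lowerKey;
    private CollationKey upperKey;

    LocaleAwareRange(String lower, String upper) {
        this.lower = lower;
        this.upper = upper;
    }

    /** Switches the range check from String.compareTo order to the Locale's collation. */
    void setLocale(Locale locale) {
        collator = Collator.getInstance(locale);
        // Keys for the bounds are computed once and reused for every term.
        lowerKey = collator.getCollationKey(lower);
        upperKey = collator.getCollationKey(upper);
    }

    /** Inclusive range test for a single term. */
    boolean contains(String term) {
        if (collator == null) {
            // No Locale set: fall back to String.compareTo (UTF-16 code unit order).
            return term.compareTo(lower) >= 0 && term.compareTo(upper) <= 0;
        }
        CollationKey key = collator.getCollationKey(term);
        return key.compareTo(lowerKey) >= 0 && key.compareTo(upperKey) <= 0;
    }
}
```

CollationKey comparisons give the same ordering as Collator.compare for keys produced by the same Collator, so caching the keys for the bounds avoids recomputing them on every comparison.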
