lucene-dev mailing list archives

From "Hoss Man (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-1279) RangeQuery and RangeFilter should use collation to check for range inclusion
Date Tue, 06 May 2008 04:37:55 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-1279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12594464#action_12594464 ]

Hoss Man commented on LUCENE-1279:
----------------------------------

a few random thoughts:

1) you should be able to at least start the enumerator by skipping to a term consisting of
the lowerTermField and the termText of "" ... even if the Collation of the term text is random,
you still know which field you want.

2) why can a collator only be specified by a Locale? why not just let people specify the Collator
they want directly?

3) instead of adding a new public CollatingRangeQuery, would it make more sense to add an
optional Collator to RangeQuery (and RangeFilter) which triggers a different code path when
non null?  (from a performance standpoint it would basically be one conditional check at the
beginning of the rewrite method.)

4) when i first saw the thread that spawned this issue, my first reaction was to wonder if
it would make sense to start allowing a Collator to be specified when indexing, and to use
the raw bytes from the CollationKey as the indexed value -- I haven't thought it through very
hard, but i wonder if that would be feasible (it seems like it would certainly be faster at query
time, since it would allow more traditional term skipping).
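
to make (4) a bit more concrete, here is a rough sketch using only java.text (no Lucene classes). the class and method names (CollationKeySketch, sortKey, compareUnsigned, inRange) are made up for illustration and are not part of any patch; it only assumes that CollationKey.toByteArray() produces bytes that can be compared most significant byte first, as its javadoc describes:

```java
import java.text.CollationKey;
import java.text.Collator;
import java.util.Locale;

public class CollationKeySketch {

    /** Returns the raw sort-key bytes a collation-aware indexer might store for a term (hypothetical helper). */
    static byte[] sortKey(Collator collator, String termText) {
        CollationKey key = collator.getCollationKey(termText);
        return key.toByteArray();
    }

    /** Compares two keys as unsigned bytes, most significant byte first. */
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    /** Inclusive range check on precomputed keys; no Collator is consulted here. */
    static boolean inRange(byte[] term, byte[] lower, byte[] upper) {
        return compareUnsigned(lower, term) <= 0 && compareUnsigned(term, upper) <= 0;
    }

    public static void main(String[] args) {
        Collator collator = Collator.getInstance(Locale.ENGLISH);
        byte[] lower = sortKey(collator, "a");
        byte[] upper = sortKey(collator, "c");
        // "B" falls between "a" and "c" under English collation.
        System.out.println(inRange(sortKey(collator, "B"), lower, upper));
    }
}
```

since the stored bytes already sort in collation order, a plain byte-wise comparison is enough at query time, which is what would make ordinary term skipping possible. the sketch ignores how such bytes would actually be encoded as term text in the index.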

> RangeQuery and RangeFilter should use collation to check for range inclusion
> ----------------------------------------------------------------------------
>
>                 Key: LUCENE-1279
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1279
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>    Affects Versions: 2.3.1
>            Reporter: Steven Rowe
>            Priority: Minor
>             Fix For: 2.4
>
>         Attachments: LUCENE-1279.patch
>
>
> See [this java-user discussion|http://www.nabble.com/lucene-farsi-problem-td16977096.html]
> of problems caused by Unicode code-point comparison, instead of collation, in RangeQuery.
> RangeQuery could take in a Locale via a setter, which could be used with a java.text.Collator
> and/or CollationKey's, to handle ranges for languages which have alphabet orderings different
> from those in Unicode.
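
A minimal sketch of the difference described above, using only java.text and no Lucene classes. The class CollatedRangeSketch and its inRange method are hypothetical names chosen for illustration, not the API in the attached patch; the null check stands in for an optional collator that switches the comparison when present:

```java
import java.text.Collator;
import java.util.Locale;

public class CollatedRangeSketch {

    /**
     * Inclusive range test. With a null collator this falls back to
     * String.compareTo, which orders strings by UTF-16 code unit value.
     */
    static boolean inRange(String term, String lower, String upper, Collator collator) {
        if (collator == null) {
            return lower.compareTo(term) <= 0 && term.compareTo(upper) <= 0;
        }
        return collator.compare(lower, term) <= 0 && collator.compare(term, upper) <= 0;
    }

    public static void main(String[] args) {
        Collator english = Collator.getInstance(Locale.ENGLISH);
        // Raw comparison: 'B' (0x42) sorts before 'a' (0x61), so "B" is outside [a, c].
        System.out.println(inRange("B", "a", "c", null));
        // English collation orders by letter first, so "B" is inside [a, c].
        System.out.println(inRange("B", "a", "c", english));
    }
}
```

The same term is excluded or included depending only on the comparison used, which is the kind of mismatch the java-user thread ran into with range endpoints.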

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

