lucene-dev mailing list archives

From "Robert Muir (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-2514) Change Term to use bytes
Date Thu, 05 Aug 2010 07:17:16 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-2514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12895579#action_12895579 ]

Robert Muir commented on LUCENE-2514:
-------------------------------------

By the way, I was thinking it would be nice to move this slow collated term range query
stuff either out of Lucene altogether or at least into contrib/queries.

We could make things even better by removing QueryParser's get/setRangeCollator methods.
In their place, it could have something like a boolean 'analyzeRangeQueries'.
It could then analyze the endpoints (producing byte collation keys) and use a regular, fast
term range query.

I think it's good to support collation order for people who want it, but we should make it
easy to do things the fast way.
Right now we make it easy to do things the slow way and hard to do them the fast way.
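
The idea of analyzing the endpoints into byte collation keys can be sketched with the JDK alone. This is only an illustration under stated assumptions, not Lucene code: the class CollationKeyRangeSketch and its methods toKey, compareBytes and inRange are hypothetical names, and java.text.Collator stands in for whatever collation the analyzer would apply at index time. The point is that once terms and endpoints are both collation keys, a range check needs nothing more than a plain unsigned byte comparison.

```java
import java.text.Collator;
import java.util.Arrays;
import java.util.Locale;

// Hypothetical sketch: none of these names come from Lucene.
class CollationKeyRangeSketch {

    // Convert a string into the binary collation key for the given collator.
    static byte[] toKey(Collator collator, String text) {
        return collator.getCollationKey(text).toByteArray();
    }

    // Plain unsigned, lexicographic byte comparison (Java 9+).
    static int compareBytes(byte[] a, byte[] b) {
        return Arrays.compareUnsigned(a, b);
    }

    // Inclusive range check done purely on bytes, with no Collator call per term.
    static boolean inRange(byte[] term, byte[] lower, byte[] upper) {
        return compareBytes(lower, term) <= 0 && compareBytes(term, upper) <= 0;
    }

    public static void main(String[] args) {
        Collator collator = Collator.getInstance(Locale.ENGLISH);

        // Endpoints are converted once, as a parser option might do.
        byte[] lower = toKey(collator, "apple");
        byte[] upper = toKey(collator, "cherry");

        // Terms are assumed to have been converted to keys at index time.
        byte[] term = toKey(collator, "Banana");

        System.out.println("in collated range: " + inRange(term, lower, upper));
        System.out.println("raw String order puts Banana before apple: "
                + ("Banana".compareTo("apple") < 0));
    }
}
```

The cost of collation is paid once per term at index time and once per endpoint at query time, instead of once per comparison while enumerating terms.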


> Change Term to use bytes
> ------------------------
>
>                 Key: LUCENE-2514
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2514
>             Project: Lucene - Java
>          Issue Type: Task
>          Components: Search
>    Affects Versions: 4.0
>            Reporter: Robert Muir
>            Assignee: Uwe Schindler
>         Attachments: LUCENE-2514-MTQPagedBytes.patch, LUCENE-2514-MTQPagedBytes.patch,
> LUCENE-2514-MTQPagedBytes.patch, LUCENE-2514-surrogates-dance.patch, LUCENE-2514.patch, LUCENE-2514.patch,
> LUCENE-2514.patch, LUCENE-2514.patch, LUCENE-2514.patch, LUCENE-2514.patch, LUCENE-2514.patch,
> LUCENE-2514.patch, LUCENE-2514.patch, LUCENE-2514.patch, LUCENE-2514.patch, LUCENE-2514.patch,
> LUCENE-2514.patch, LUCENE-2514.patch, LUCENE-2514_collatedrange.patch, LUCENE-2514_qp.patch
>
>
> In LUCENE-2426, the sort order was changed to codepoint order.
> Unfortunately, Term is still using String internally, and more importantly its compareTo()
> uses the wrong order [UTF-16].
> So MultiTermQuery, etc. (especially its priority queues) are currently wrong.
> By changing Term to use bytes, we can also support terms encoded as bytes such as numerics,
> instead of using strange string encodings.
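
The ordering mismatch described in the issue can be reproduced with plain JDK strings. The following is a minimal sketch, not code from any of the attached patches: the class TermOrderSketch and its two methods are hypothetical names. It compares a BMP character above the surrogate range (U+FF5E) with a supplementary character (U+10000). String.compareTo() works on UTF-16 code units, so it sees the surrogate 0xD800 as smaller than 0xFF5E, while an unsigned comparison of the UTF-8 bytes follows codepoint order.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch: these names are not part of Lucene.
class TermOrderSketch {

    // UTF-16 code unit order, which is what String.compareTo() implements.
    static int utf16Order(String a, String b) {
        return a.compareTo(b);
    }

    // Unsigned byte order over UTF-8, which matches Unicode codepoint order (Java 9+).
    static int utf8ByteOrder(String a, String b) {
        byte[] x = a.getBytes(StandardCharsets.UTF_8);
        byte[] y = b.getBytes(StandardCharsets.UTF_8);
        return Arrays.compareUnsigned(x, y);
    }

    public static void main(String[] args) {
        String bmp = "\uFF5E";                                         // U+FF5E, one code unit
        String supplementary = new String(Character.toChars(0x10000)); // U+10000, surrogate pair

        // The two orders disagree on which string sorts first.
        System.out.println("UTF-16 order sign: "
                + Integer.signum(utf16Order(bmp, supplementary)));
        System.out.println("UTF-8 byte order sign: "
                + Integer.signum(utf8ByteOrder(bmp, supplementary)));
    }
}
```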

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

