lucene-dev mailing list archives

From "Uwe Schindler (JIRA)"
Subject [jira] Updated: (LUCENE-2514) Change Term to use bytes
Date Thu, 24 Jun 2010 21:28:50 GMT


Uwe Schindler updated LUCENE-2514:

    Attachment: LUCENE-2514.patch

Here is Robert's patch with MTQ changed.

It currently still uses placeholderTerms so that it does not need to intern every time. If we
remove string interning from Term, we can replace this with a simple new Term() in MTQ.

I delayed cloning of the BytesRef until it is put into a TermQuery or the PQ, or whenever it
is otherwise set aside. So it is no longer cloned if, e.g., the term is never accepted by the PQ.
Also, the PQ reuses its ScoreTerm instances, so the term bytes are simply copied over :-)
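
A minimal sketch of the delayed-copy idea described above. The classes Entry and TopN are
hypothetical stand-ins, not Lucene's ScoreTerm, BytesRef, or the actual patch; the sketch only
assumes that the enumerator hands out a shared buffer that it overwrites for every term.

```java
import java.util.Arrays;
import java.util.PriorityQueue;

class DelayedCopyDemo {
    /** Hypothetical reusable queue entry; it owns its own copy of the term bytes. */
    static final class Entry {
        byte[] bytes = new byte[0];
        int length;
        float boost;

        /** Copies the shared bytes into this entry, growing the buffer only when needed. */
        void copyFrom(byte[] src, int len, float newBoost) {
            if (bytes.length < len) {
                bytes = new byte[len];
            }
            System.arraycopy(src, 0, bytes, 0, len);
            length = len;
            boost = newBoost;
        }

        byte[] toArray() {
            return Arrays.copyOf(bytes, length);
        }
    }

    /** Hypothetical bounded queue that keeps the maxSize entries with the highest boost. */
    static final class TopN {
        private final int maxSize;
        private final PriorityQueue<Entry> pq;
        int copies; // counts how often term bytes were actually copied

        TopN(int maxSize) {
            this.maxSize = maxSize;
            // Min-heap on boost: the weakest entry sits at the head.
            this.pq = new PriorityQueue<>(maxSize, (a, b) -> Float.compare(a.boost, b.boost));
        }

        /** Offers a term held in a shared buffer; bytes are copied only on acceptance. */
        boolean offer(byte[] shared, int len, float boost) {
            Entry slot;
            if (pq.size() < maxSize) {
                slot = new Entry();          // queue not full yet: allocate a new entry
            } else if (boost > pq.peek().boost) {
                slot = pq.poll();            // evict the weakest entry and reuse its instance
            } else {
                return false;                // rejected: nothing is cloned or copied
            }
            slot.copyFrom(shared, len, boost);
            copies++;
            pq.add(slot);
            return true;
        }

        int size() {
            return pq.size();
        }
    }

    public static void main(String[] args) {
        TopN queue = new TopN(2);
        byte[] shared = new byte[1];         // buffer overwritten for every enumerated term
        float[] boosts = {1.0f, 3.0f, 0.5f, 2.0f};
        for (int i = 0; i < boosts.length; i++) {
            shared[0] = (byte) ('a' + i);
            queue.offer(shared, 1, boosts[i]);
        }
        System.out.println("copies made: " + queue.copies);
    }
}
```

In this sketch the term with boost 0.5 is rejected without any copy, and the term with boost 2.0
overwrites the bytes of the evicted entry instead of allocating a new one.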

I also removed a Java 1.6 interface override (an @Override on an interface method) - the
Generics Policeman gives a ticket! I don't understand where those come from; compiling with
Java 1.6 should also fail, as the ant build uses -source 1.5...?

> Change Term to use bytes
> ------------------------
>                 Key: LUCENE-2514
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Task
>          Components: Search
>    Affects Versions: 4.0
>            Reporter: Robert Muir
>         Attachments: LUCENE-2514-surrogates-dance.patch, LUCENE-2514.patch, LUCENE-2514.patch,
>                      LUCENE-2514.patch, LUCENE-2514.patch, LUCENE-2514.patch
> in LUCENE-2426, the sort order was changed to codepoint order.
> unfortunately, Term is still using string internally, and more importantly its compareTo()
> uses the wrong order [utf-16].
> So MultiTermQuery, etc (especially its priority queues) are currently wrong.
> By changing Term to use bytes, we can also support terms encoded as bytes such as numerics,
> instead of using strange string encodings.
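
The ordering mismatch in the quoted description can be illustrated with plain Java. This is
only a sketch: TermOrderDemo and its helper methods are hypothetical names, not Lucene code.
It compares U+FF61 (one UTF-16 code unit) with U+10000 (a surrogate pair) once via
String.compareTo(), which compares UTF-16 code units, and once via unsigned comparison of the
UTF-8 bytes, which matches code point order.

```java
import java.nio.charset.StandardCharsets;

class TermOrderDemo {
    /** Compares two byte arrays as unsigned bytes; for UTF-8 this equals code point order. */
    static int compareUnsignedBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    /** Compares two strings by their UTF-8 encoded bytes. */
    static int compareUtf8(String a, String b) {
        return compareUnsignedBytes(a.getBytes(StandardCharsets.UTF_8),
                                    b.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        String bmp = "\uFF61";                  // U+FF61, UTF-8 bytes EF BD A1
        String supplementary = "\uD800\uDC00";  // U+10000, UTF-8 bytes F0 90 80 80

        // UTF-16 order: 0xFF61 > 0xD800, so U+FF61 sorts AFTER U+10000.
        System.out.println(Integer.signum(bmp.compareTo(supplementary)));    // prints 1

        // Code point (UTF-8 byte) order: 0xEF < 0xF0, so U+FF61 sorts BEFORE U+10000.
        System.out.println(Integer.signum(compareUtf8(bmp, supplementary))); // prints -1
    }
}
```

The two comparisons disagree for this pair, which is the kind of case in which a priority queue
ordered by a string-based compareTo() would differ from an index sorted in code point order.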

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
