lucene-dev mailing list archives

From "Uwe Schindler (JIRA)" <>
Subject [jira] Commented: (LUCENE-2354) Convert NumericUtils and NumericTokenStream to use BytesRef instead of Strings/char[]
Date Mon, 29 Mar 2010 17:23:27 GMT


Uwe Schindler commented on LUCENE-2354:

bq. But the encoding is unchanged right? (Ie only using 7 bits per byte, same as trunk).

Yes, and I think we should keep it at 7 bits for now. Problems start when the sort order
of terms is needed (which is the case for NRQ). Since the default in flex is the UTF-8 term comparator,
wouldn't numeric fields using the full 8 bits sort incorrectly?
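
To illustrate the 7-bit point, here is a minimal sketch. It is not the actual NumericUtils code: the names SevenBitSketch, encode and compareBytes are made up for illustration, and the shift/prefix handling is left out entirely. It only shows that packing a long into bytes with 7 payload bits each keeps every byte in the range 0..127, so a plain bytewise comparison gives the same order as the numeric values.

```java
final class SevenBitSketch {

    // Hypothetical helper: encodes a long into 10 bytes, 7 payload bits per byte,
    // most significant bits first.
    public static byte[] encode(long value) {
        // Flip the sign bit so that negative values sort before positive ones.
        long sortable = value ^ 0x8000000000000000L;
        byte[] out = new byte[10]; // ceil(64 / 7) = 10 bytes
        for (int i = out.length - 1; i >= 0; i--) {
            out[i] = (byte) (sortable & 0x7f); // high bit of each byte stays 0
            sortable >>>= 7;
        }
        return out;
    }

    // Hypothetical helper: lexicographic comparison, treating bytes as unsigned.
    public static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        long[] values = {Long.MIN_VALUE, -42L, -1L, 0L, 1L, 1000L, Long.MAX_VALUE};
        for (int i = 1; i < values.length; i++) {
            byte[] prev = encode(values[i - 1]);
            byte[] cur = encode(values[i]);
            if (compareBytes(prev, cur) >= 0) {
                throw new AssertionError("order broken at index " + i);
            }
        }
        System.out.println("byte order matches numeric order");
    }
}
```

Because no byte ever has its high bit set, each one is also a valid single-byte UTF-8 character, which would be why the char-based and byte-based term orders agree for this encoding.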

bq. And you cutover to BytesRef TermsEnum API too - great. Presumably search perf would improve
but only a tiny bit since NRQ visits so few terms?

I don't think you will notice a difference. A standard int range contains maybe 10 to 20 sub-ranges
(at most), so converting between String and TermRef should not matter. But the new implementation
is cleaner. In principle we could remove the whole char[]/String-based API in NumericUtils;
I would only have to rewrite the tests and remove the NumericUtils test in backwards (as it no longer
applies then either).

> Convert NumericUtils and NumericTokenStream to use BytesRef instead of Strings/char[]
> -------------------------------------------------------------------------------------
>                 Key: LUCENE-2354
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: Flex Branch
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>             Fix For: Flex Branch
>         Attachments: LUCENE-2354.patch
> After LUCENE-2302, we should use TermToBytesRefAttribute to index using NumericTokenStream.
> This also should convert the whole NumericUtils to use BytesRef when converting numerics.

