lucene-dev mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: Build failed in Hudson: Lucene-trunk #1187
Date Fri, 14 May 2010 15:27:45 GMT
On Fri, May 14, 2010 at 11:23 AM, Yonik Seeley
<yonik@lucidimagination.com> wrote:
> On Fri, May 14, 2010 at 11:21 AM, Michael McCandless
> <lucene@mikemccandless.com> wrote:
>> On Fri, May 14, 2010 at 10:59 AM, Yonik Seeley
>> <yonik@lucidimagination.com> wrote:
>>> On Fri, May 14, 2010 at 7:29 AM, Robert Muir <rcmuir@gmail.com> wrote:
>>>> On Fri, May 14, 2010 at 5:14 AM, Michael McCandless
>>>> <lucene@mikemccandless.com> wrote:
>>>>> Or just cutover to UTF8 order for trunk.
>>>>
>>>> I would really prefer we go this route, instead of trying to do any
>>>> hacks at this point!
>>>
>>> Sounds good...
>>> So it seems like the biggest issue we might have in cutting over would
>>> be the field cache and sorting?  Instead of using String.compareTo we
>>> need one that compares as UTF-32 (or longer term, don't even create
>>> strings of course...)
>>
>> Actually, I think on changing to unicode codepoint order, the
>> StringIndex returned by FieldCache would in fact be sorted in
>> codepoint order (even though it's still a String[]), because it just
>> enums the terms from TermsEnum.
>
> Right... the FieldCache will be ordered correctly... but when the sort
> code compares values across segments?

Ahh yes we'd have to use a comparator based on codepoint, not
String.compareTo, at that point.
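
As a rough sketch of what such a comparator could look like (the names
CodePointOrder and CODE_POINT_ORDER are hypothetical, not existing Lucene
API), one option is to walk both strings a code point at a time.
String.compareTo compares UTF-16 code units, so a supplementary character
(encoded as surrogates in 0xD800-0xDFFF) sorts before BMP characters in
0xE000-0xFFFF, which disagrees with codepoint (and UTF-8 byte) order:

```java
import java.util.Comparator;

// Hypothetical helper for illustration only; not part of Lucene.
final class CodePointOrder {
    private CodePointOrder() {}

    // Orders strings by Unicode code point rather than by UTF-16 code unit.
    // For well-formed strings this matches the order of their UTF-8 bytes.
    static final Comparator<String> CODE_POINT_ORDER = (a, b) -> {
        int i = 0;
        int j = 0;
        while (i < a.length() && j < b.length()) {
            int cpA = a.codePointAt(i);
            int cpB = b.codePointAt(j);
            if (cpA != cpB) {
                return Integer.compare(cpA, cpB);
            }
            // Equal code points occupy the same number of chars in both strings.
            i += Character.charCount(cpA);
            j += Character.charCount(cpB);
        }
        // One string is a prefix of the other (or they are equal):
        // the one with chars left over sorts last.
        return Integer.compare(a.length() - i, b.length() - j);
    };
}
```

With this, "\uFF5E" sorts before "\uD800\uDC00" (U+10000), whereas
String.compareTo puts them the other way around. Longer term, comparing
the UTF-8 bytes directly would avoid creating the strings at all.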

I think we should first fix FieldCache to return BytesRef-based
getStrings/getStringIndex (LUCENE-2380).... I'll go take it.

Mike
