lucene-dev mailing list archives

From Michael McCandless
Subject Re: Build failed in Hudson: Lucene-trunk #1187
Date Fri, 14 May 2010 09:14:45 GMT
Wow another issue caught by random testing!

On Fri, May 14, 2010 at 1:42 AM, Robert Muir wrote:
> the problem is a logic bug (e.g. i have no clue how to really fix
> except to switch over to a UTF-8 sort order).
> in converting automaton to utf-8/32, and trying to emulate the utf-16
> term dictionary order, the byte transition ranges (although sorted in
> utf-16 order) are themselves in utf-8/32 order: e.g. a byte range of
> 0xe0-0xef is problematic during enumeration since the 0xee-0xef
> component should be "sorted last" in utf-16 order.
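To make the ordering mismatch concrete, here is a small sketch (not Lucene code; the term list and helper names are made up for illustration). It sorts the same terms by UTF-8 bytes and by UTF-16 code units and shows the UTF-8 lead byte of each term in both orders.

```python
# Illustrative only: compare UTF-8 byte order with UTF-16 code unit order.
# TERMS and the helper names are hypothetical, chosen for this sketch.

def utf8_key(term: str) -> bytes:
    # Bytewise comparison of UTF-8 matches Unicode code point order.
    return term.encode("utf-8")

def utf16_key(term: str) -> bytes:
    # Big-endian encoding, so bytewise comparison matches
    # comparison of 16-bit code units.
    return term.encode("utf-16-be")

def lead_bytes(terms):
    # First UTF-8 byte of each term, formatted as hex.
    return [f"{utf8_key(t)[0]:#04x}" for t in terms]

TERMS = ["\u00e9", "\ud7ff", "\ue000", "\uffff", "\U00010000", "\U0010ffff"]

by_utf8 = sorted(TERMS, key=utf8_key)
by_utf16 = sorted(TERMS, key=utf16_key)

print("UTF-8 order, lead bytes: ", lead_bytes(by_utf8))
print("UTF-16 order, lead bytes:", lead_bytes(by_utf16))
```

In UTF-16 order the supplementary characters (UTF-8 lead bytes 0xf0-0xf4) come before U+E000-U+FFFF (lead bytes 0xee-0xef), because their surrogate code units 0xD800-0xDFFF are smaller than 0xE000. That is why a single 0xe0-0xef byte range cannot be walked in one pass when emulating UTF-16 order.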

Ugh.  I suppose we could forcefully split such edges?  (We'd have to
fix reduce to not consolidate them).
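A rough sketch of what splitting such an edge might look like, assuming the edge is a transition on the first byte of a UTF-8 sequence. The function name and the segment table are hypothetical, and this is not how the automaton code in Lucene is written; it only illustrates the idea.

```python
# Hypothetical sketch: split a lead-byte range so each piece can be
# enumerated in emulated UTF-16 order. Applies only to the first byte
# of a UTF-8 sequence, not to continuation bytes.

# Lead-byte segments listed in the order UTF-16 sorting would visit them:
# everything up to 0xED, then the 4-byte leads, then 0xEE-0xEF last.
UTF16_ORDER_SEGMENTS = [(0x00, 0xED), (0xF0, 0xFF), (0xEE, 0xEF)]

def split_lead_byte_range(lo: int, hi: int):
    """Return sub-ranges of [lo, hi], ordered for UTF-16 emulation."""
    pieces = []
    for seg_lo, seg_hi in UTF16_ORDER_SEGMENTS:
        a, b = max(lo, seg_lo), min(hi, seg_hi)
        if a <= b:  # the range overlaps this segment
            pieces.append((a, b))
    return pieces

for lo, hi in [(0xE0, 0xEF), (0xE0, 0xF4), (0x61, 0x7A)]:
    pieces = split_lead_byte_range(lo, hi)
    print(f"{lo:#04x}-{hi:#04x} ->",
          ", ".join(f"{a:#04x}-{b:#04x}" for a, b in pieces))
```

With this, 0xe0-0xef becomes 0xe0-0xed followed by 0xee-0xef, and a range that also covers 0xf0-0xf4 gets that piece placed before 0xee-0xef. Any step that merges adjacent ranges would need to leave these pieces separate.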

Or just cut over to UTF-8 order for trunk.

> i know a workaround until we switch over, but it's gonna cause wasted
> seeks at the least (it's just wrong).

This is the FIXME you committed, right?  I.e., always seek...

