lucene-dev mailing list archives

From "Uwe Schindler" <>
Subject RE: BytesRef comparable
Date Mon, 03 May 2010 13:07:57 GMT
In current flex, TermsEnum has a method called getComparator() that returns the comparator
used by that TermsEnum. That is the reason BytesRef is not Comparable itself. If we change this
to only support natural byte[] ordering (unsigned, of course; die, die, Java's signedness of
byte!), and codecs should not support any other ordering in the future (in my opinion another
ordering would break most MTQs that depend on the order, like range, fuzzy, wildcard, etc.),
then getComparator() is obsolete.

But with current code the BoundedTreeSet should take the comparator provided by the TermsEnum.
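
To make the signed/unsigned point concrete, here is a minimal sketch in plain Java. It does not use any Lucene or Solr classes: the class name, both comparators, and the utf8() helper are made up for illustration, and a plain java.util.TreeSet stands in for a sorted set that accepts a comparator. It only shows that comparing UTF-8 bytes as unsigned values matches Unicode code point order, while Java's default signed byte comparison does not.

```java
import java.nio.charset.StandardCharsets;
import java.util.Comparator;
import java.util.TreeSet;

// Illustrative only: none of these names come from Lucene or Solr.
final class UnsignedByteOrderSketch {

    // Lexicographic comparison that treats every byte as unsigned (0..255).
    static final Comparator<byte[]> UNSIGNED_BYTES = (a, b) -> {
        int len = Math.min(a.length, b.length);
        for (int i = 0; i < len; i++) {
            int diff = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (diff != 0) {
                return diff;
            }
        }
        // On a shared prefix, the shorter array sorts first.
        return a.length - b.length;
    };

    // Same loop, but with Java's signed bytes (-128..127), shown for contrast.
    static final Comparator<byte[]> SIGNED_BYTES = (a, b) -> {
        int len = Math.min(a.length, b.length);
        for (int i = 0; i < len; i++) {
            int diff = a[i] - b[i];
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    };

    // Hypothetical helper: encode a string as UTF-8 bytes.
    static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] z = utf8("z");            // single byte 0x7A
        byte[] eAcute = utf8("\u00e9");  // two bytes 0xC3 0xA9

        // Unsigned order agrees with code point order (U+007A < U+00E9).
        System.out.println("unsigned: z < e-acute? "
                + (UNSIGNED_BYTES.compare(z, eAcute) < 0));
        // Signed order disagrees, because 0xC3 is negative as a Java byte.
        System.out.println("signed:   z < e-acute? "
                + (SIGNED_BYTES.compare(z, eAcute) < 0));

        // A sorted set that is handed its comparator explicitly.
        TreeSet<byte[]> terms = new TreeSet<>(UNSIGNED_BYTES);
        terms.add(eAcute);
        terms.add(z);
        terms.add(utf8("a"));
        for (byte[] term : terms) {
            System.out.println(new String(term, StandardCharsets.UTF_8));
        }
    }
}
```

Passing the comparator into the set's constructor keeps the ordering decision with whoever supplies the terms, which is the role getComparator() plays as long as codecs may define their own order.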

Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen

> -----Original Message-----
> From: [] On Behalf Of Yonik
> Seeley
> Sent: Monday, May 03, 2010 2:56 PM
> To:
> Subject: Re: BytesRef comparable
> On Mon, May 3, 2010 at 6:30 AM, Michael McCandless
> <> wrote:
> > The problem is BytesRef is not really a concrete object.  It can't
> > know how the terms it's representing are supposed to sort.
> > Yet nearly all the time this sort will be lucene's default term sort
> > (only custom codecs can change this), so I'm +1 on making BytesRef
> > sort according to that (note that this is not actually natural byte[]
> > order, because we must interpret the UTF-8 bytes as unsigned to sort in
> > unicode code point order).
> I thought we were going to be changing lucene's index order to the
> natural byte order (same as the unicode code point order)?
> Solr's BoundedTreeSet doesn't take a comparator.  I could change it so
> that it could of course... but it just seemed natural to "fix"
> BytesRef.
> -Yonik
> Apache Lucene Eurocon 2010
> 18-21 May 2010 | Prague
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:
