lucene-dev mailing list archives

From Mark Miller
Subject Re: lucene 2.9 sorting algorithm
Date Fri, 23 Oct 2009 04:25:17 GMT
>> the new API is much harder for the average user to use, and even for
>> the experienced user, it's not terribly fun, and more importantly:

Do we have enough info to support that though? All the cases I have seen
on the list, people have figured it out pretty easily - haven't really
seen any complaints in that regard (not counting you and John - that is
two). The only other complaints I have noticed are those that happened
to count on unsupported behavior (eg people counting on no MultiSearcher

I think Uwe had some good ideas for exposing an easier API with the new one.

Jake Mannix wrote:
> On Thu, Oct 22, 2009 at 8:30 PM, Yonik Seeley wrote:
>     On Thu, Oct 22, 2009 at 11:11 PM, Jake Mannix wrote:
>     > It's hard to read the column format, but if you look up above in the
>     > thread from tonight, you can see that yes, for PQ sizes less than 100
>     > elements, multiPQ is better, and only starts to be worse at around 100
>     > for strings, and 50 for ints.
>     Ah, OK, I had missed John's followup with the numbers.
>     I assume this is for Java5 + optimizations?
> Yeah, this was for Java5 + optimizations.
>     What does Java6 show?
> Java6 on Mac showed close to what Mike posted in his report on the Jira
> ticket - that single-PQ performs a little better for small pq, and more
> like 30-40% better for large pq.
>     My biggest reservation is that we've gone down the road of telling
>     people to implement a new style of comparators, and told them that the
>     old style comparators would be deleted in the next release (which is
>     where we are).  Reversing that will be a bit of a headache/question...
>     the new stuff isn't deprecated, and having *both* isn't desirable, but
>     that's a separate decision to be made apart from performance testing.
> Well the issue comes down to: if the performance is *basically comparable*
> between the two approaches, then the new API is much harder for the
> average user to use, and even for the experienced user, it's not terribly
> fun, and more importantly: for the user who has already implemented custom
> sorts on the old API, upgrading is enough trouble that people may decide
> it's not worth it.  It probably *is* worth it, but if you're going to even
> put that kind of thinking in the user's head, you've got to ask yourself:
> what's the reasoning for going with a more complex API if you can get equal
> (slightly better in some cases, slightly worse in others) performance with
> a simpler API?
> Yes, as Mike says, the new API is *not* breaking back-compat in a
> functional sense, but how many users have converted to the new sorting
> api already?  2.9 has barely just come out, and while it's work for the
> community as a whole to reconsider the multi-segment sorting api, and
> work to implement a change at this level, if it's the right thing to do,
> we shouldn't let the question of which method is deprecated dictate
> which one *should* be deprecated.
>     Is there also an option of using a multiPQ approach with the new style
>     comparators?
> For the record: that would be the worst of all worlds, in my view: harder
> API with only better performance in some cases, and sometimes worse
> performance.
>   -jake
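
For anyone skimming the thread, here is a rough sketch of the two strategies being compared, as far as they can be inferred from the discussion: one priority queue shared across all segments ("single-PQ") versus one queue per segment whose results are merged at the end ("multiPQ"). This is plain Python rather than Lucene's Java API. The function names, the list-of-ints model of a segment, and the ascending sort are all assumptions made for illustration, and the sketch says nothing about the comparator costs that drive the benchmark numbers quoted above.

```python
import heapq
import itertools


def top_n_single_pq(segments, n):
    """Keep one bounded queue of size n that is shared by every segment."""
    # Max-heap emulated with negated tuples; it holds the n smallest entries so far.
    heap = []
    for seg_id, values in enumerate(segments):
        for doc_id, value in enumerate(values):
            entry = (-value, -seg_id, -doc_id)
            if len(heap) < n:
                heapq.heappush(heap, entry)
            elif entry > heap[0]:
                # The new entry sorts before the current worst one, so replace it.
                heapq.heapreplace(heap, entry)
    return sorted((-v, -s, -d) for v, s, d in heap)


def top_n_multi_pq(segments, n):
    """Collect a top-n per segment, then merge the per-segment results."""
    per_segment = []
    for seg_id, values in enumerate(segments):
        entries = ((value, seg_id, doc_id) for doc_id, value in enumerate(values))
        # nsmallest returns its result already sorted in ascending order.
        per_segment.append(heapq.nsmallest(n, entries))
    # Merge the sorted per-segment lists and keep only the first n entries.
    return list(itertools.islice(heapq.merge(*per_segment), n))


if __name__ == "__main__":
    # Three hypothetical segments; each list holds the sort key of docs 0, 1, 2, ...
    segments = [[5, 1, 9], [3, 7], [2, 8, 4]]
    print(top_n_single_pq(segments, 4))
    print(top_n_multi_pq(segments, 4))
```

Both functions return the same (value, segment, doc) tuples; the difference is only in how many queues are maintained and when entries from different segments get compared against each other.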
