lucene-dev mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: lucene 2.9 sorting algorithm
Date Fri, 23 Oct 2009 14:43:47 GMT
Agreed: so far I'm seeing serious performance loss with MultiPQ,
especially as topN gets larger, and for int sorting.

For a small queue with String sort, it sometimes wins.

So if I were forced to decide now based on the current results, I
think we should keep the single PQ API.

But: I am right now optimizing John's patch to see how fast Multi PQ
can get.  I'll post it once I get it working, and post output from
re-running on my opensolaris box.

Mike
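
As a rough illustration of the two designs being compared in this thread, here is a minimal, self-contained Java sketch for an ascending int sort. It does not use Lucene: `TopNSketch`, `IntSlotComparator`, `singlePq` and `multiPq` are hypothetical names, and only the four method names (compare, compareBottom, copy, setBottom) come from the discussion quoted below. The single-queue path goes through a slot-based comparator; the multi-queue path keeps one small queue per segment and merges them at the end, which is an assumption about the general shape of the Multi PQ approach, not a copy of John's patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch, not Lucene code: each int[] stands in for one segment's field values.
final class TopNSketch {

    // Simplified stand-in for a slot-based comparator (ascending int sort).
    static final class IntSlotComparator {
        final int[] slots;      // values copied out of segments, one per queue entry
        private int[] current;  // values of the segment currently being searched
        private int bottom;     // cached value of the weakest queue entry

        IntSlotComparator(int numSlots) { slots = new int[numSlots]; }

        void setSegment(int[] values) { current = values; }
        int compare(int slotA, int slotB) { return Integer.compare(slots[slotA], slots[slotB]); }
        int compareBottom(int doc) { return Integer.compare(bottom, current[doc]); }
        void copy(int slot, int doc) { slots[slot] = current[doc]; }
        void setBottom(int slot) { bottom = slots[slot]; }
    }

    // Single PQ: one queue of slot numbers shared across all segments.
    static int[] singlePq(int[][] segments, int topN) {
        if (topN <= 0) return new int[0];
        IntSlotComparator cmp = new IntSlotComparator(topN);
        // Reversed order so the weakest (largest) retained value sits at the head.
        PriorityQueue<Integer> pq = new PriorityQueue<>((a, b) -> cmp.compare(b, a));
        int filled = 0;
        for (int[] segment : segments) {
            cmp.setSegment(segment);
            for (int doc = 0; doc < segment.length; doc++) {
                if (filled < topN) {
                    cmp.copy(filled, doc);
                    pq.add(filled++);
                    if (filled == topN) cmp.setBottom(pq.peek());
                } else if (cmp.compareBottom(doc) > 0) {
                    int slot = pq.poll();   // evict the weakest entry and reuse its slot
                    cmp.copy(slot, doc);
                    pq.add(slot);
                    cmp.setBottom(pq.peek());
                }
            }
        }
        return Arrays.stream(cmp.slots, 0, filled).sorted().toArray();
    }

    // Multi PQ: one queue per segment using a plain value comparison, merged at the end.
    static int[] multiPq(int[][] segments, int topN) {
        if (topN <= 0) return new int[0];
        List<Integer> candidates = new ArrayList<>();
        for (int[] segment : segments) {
            PriorityQueue<Integer> pq = new PriorityQueue<>(Comparator.reverseOrder());
            for (int value : segment) {
                if (pq.size() < topN) {
                    pq.add(value);
                } else if (value < pq.peek()) {
                    pq.poll();
                    pq.add(value);
                }
            }
            candidates.addAll(pq);  // each segment contributes at most topN survivors
        }
        return candidates.stream().sorted().limit(topN).mapToInt(Integer::intValue).toArray();
    }

    public static void main(String[] args) {
        int[][] segments = {{42, 7, 19}, {3, 88, 15}, {27, 1, 64}};
        System.out.println(Arrays.toString(singlePq(segments, 2))); // [1, 3]
        System.out.println(Arrays.toString(multiPq(segments, 2)));  // [1, 3]
    }
}
```

Both paths return the same top N; what differs is where the work happens. The single-queue path needs the copy/setBottom bookkeeping in the comparator, while the multi-queue path keeps the comparison trivial at the cost of extra queues and a final merge. The sketch says nothing about which is faster; that is what the benchmarks in this thread are measuring.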

2009/10/23 Mark Miller <markrmiller@gmail.com>:
>>>I still think we should if performance is no
>>>better with the new one.
>
> Where is there any indication performance is not better with the new one?
>
> The benchmarks are clearly against switching back. At best they could argue
> for two APIs - even then it depends - a loss of 10% on Java 1.5 with the
> most recent Linux for a topn:10? I'm all for more results, but it's not
> looking like a good switch to me. What API do I use? Well, it depends - how
> many docs will you ask for back, what OS are you running, how hard is it for
> you to grok one API over the other?
>
> And then as we make changes in the future we have to manage both APIs.
>
> bq. digging in deep and running thorough perf tests makes sense
>
> Again - no one is arguing against - dig all year - I'll help - but I don't
> see the treasure yet, and the hole is starting to look deep.
>
> bq. removing that if from the Multi PQ patch makes sense
>
> I didn't have a problem with that either - or other code changes - but
> jeez, mention what you are seeing with the switch. I'll tell you what I
> saw - not that much - a bit of improvement, but take a look at the
> Java 1.5 run - it ended up being a blade of grass holding up a boulder
> on Linux.
>
>
>
> Michael McCandless wrote:
>> Sheesh I go to bed and so much all of a sudden happens!!
>>
>> Sorry Mark; I should've called out "PATCH IS ON 2.9 BRANCH" more
>> clearly ;)
>>
>> There's no question in my mind that the new comparator API is more
>> complex than the old one, and I really don't like that.  I had to
>> rewrite the section of LIA that gives an example of a [simple] custom
>> sort and it wasn't pleasant!  Two compare methods (compare,
>> compareBottom)?  Two copy methods (copy, setBottom)?  Sure, you can
>> grok it and get through it if you have to, but it is more complex
>> because it's conflated with the PQ API.
>>
>> Ease of consumption of our APIs is very important, so, only when
>> performance clearly warrants it should we adopt a more complex API.
>>
>> Also, yeah, it would suck to have to switch back to the old API at
>> this point, but net/net I still think we should if performance is no
>> better with the new one.
>>
>> The old API also fits cleanly with per-segment searching (John's
>> initial patch shows that -- it's simply another per-segment Collector).
>> The two APIs (collection, comparator) are well decoupled.
>>
>> So, digging in deep and running thorough perf tests makes sense; we
>> need to understand the performance to make the API switch decision.
>> And definitely we should tune both approaches as much as possible
>> (removing that if from the Multi PQ patch makes sense).
>>
>> But... Multi PQ's performance isn't better in many cases... though,
>> we're clearly still iterating.  I'll run a 1.5 (32 & 64 bit) test,
>> with the if statement removed.
>>
>> Mike
>>
>> On Fri, Oct 23, 2009 at 3:53 AM, Earwin Burrfoot <earwin@gmail.com> wrote:
>>
>>> I did.
>>>
>>> On Fri, Oct 23, 2009 at 09:05, Jake Mannix <jake.mannix@gmail.com> wrote:
>>>
>>>> On Thu, Oct 22, 2009 at 9:58 PM, Mark Miller <markrmiller@gmail.com> wrote:
>>>>
>>>>> Yes - I've seen a handful of non core devs report back that they
>>>>> upgraded with no complaints on the difficulty. It's in the mailing list
>>>>> archives. The only core dev I've seen say it's easy is Uwe. He's super
>>>>> sharp though, so I wasn't banking my comment on him ;)
>>>>>
>>>> Upgrade custom sorting?  Where has anyone talked about this?
>>>>
>>>> 2.9 is great, I like the new APIs, they're great in general.  It's just
>>>> this multi-segment sorting we're talking about here.
>>>>
>>>>   -jake
>>>>
>>>>
>>>>
>>>
>>> --
>>> Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
>>> Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
>>> ICQ: 104465785
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-dev-help@lucene.apache.org
>>>
>>>
>>>
>>
>
>
> --
> - Mark
>
> http://www.lucidimagination.com
>
>
