lucene-dev mailing list archives

From "Michael McCandless (JIRA)" <j...@apache.org>
Subject [jira] Updated: (LUCENE-2504) sorting performance regression
Date Wed, 01 Sep 2010 12:21:54 GMT

     [ https://issues.apache.org/jira/browse/LUCENE-2504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-2504:
---------------------------------------

    Attachment: LUCENE-2504.patch

OK, I implemented Yonik's suggestion here: the comparator may now
return a new segment-specific FieldComparator on each call to
.setNextReader.  I fixed all FieldComparators to simply "return this",
except for TermOrdValComparator, which returns a comparator
specialized to the bit width of the packed ints doc->ord mapping for
the fixed-array (8, 16, 32) cases.
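
To make the shape of this change concrete, here is a minimal sketch of
the pattern, not the code in the attached patch.  All names in it
(OrdReader, Direct8, Direct16, OrdSortState, OrdComparator and its
subclasses) are hypothetical stand-ins rather than Lucene's actual
classes.  It also assumes ords are directly comparable across
segments, which the real TermOrdValComparator cannot assume.

```java
// Hypothetical stand-in for a packed-ints doc->ord reader (not Lucene's actual API).
interface OrdReader {
    int get(int doc);
}

// Fixed-array case, 8 bits per value; the backing array is exposed for specialization.
final class Direct8 implements OrdReader {
    final byte[] values;
    Direct8(byte[] values) { this.values = values; }
    @Override public int get(int doc) { return values[doc] & 0xFF; }
}

// Fixed-array case, 16 bits per value.
final class Direct16 implements OrdReader {
    final short[] values;
    Direct16(short[] values) { this.values = values; }
    @Override public int get(int doc) { return values[doc] & 0xFFFF; }
}

// State that must survive the switch from one per-segment comparator to the next.
final class OrdSortState {
    final int[] slotOrds;   // ord stored for each slot of the hit queue
    int bottomOrd;          // ord of the weakest hit currently in the queue
    OrdSortState(int numHits) { this.slotOrds = new int[numHits]; }
}

abstract class OrdComparator {
    protected final OrdSortState state;
    OrdComparator(OrdSortState state) { this.state = state; }

    abstract int compareBottom(int doc);
    abstract void copy(int slot, int doc);

    void setBottom(int slot) { state.bottomOrd = state.slotOrds[slot]; }

    // Returns the comparator to use for the next segment. The caller must
    // switch to the returned instance; it may differ from "this".
    OrdComparator setNextReader(OrdReader reader) {
        if (reader instanceof Direct8) {
            return new ByteOrdComparator(state, ((Direct8) reader).values);
        }
        if (reader instanceof Direct16) {
            return new ShortOrdComparator(state, ((Direct16) reader).values);
        }
        return new GenericOrdComparator(state, reader);
    }
}

// Specialized: the hot loop touches a byte[] directly, with no reader call in between.
final class ByteOrdComparator extends OrdComparator {
    private final byte[] ords;
    ByteOrdComparator(OrdSortState state, byte[] ords) { super(state); this.ords = ords; }
    @Override int compareBottom(int doc) { return state.bottomOrd - (ords[doc] & 0xFF); }
    @Override void copy(int slot, int doc) { state.slotOrds[slot] = ords[doc] & 0xFF; }
}

// Specialized: same idea for the 16-bit fixed-array case.
final class ShortOrdComparator extends OrdComparator {
    private final short[] ords;
    ShortOrdComparator(OrdSortState state, short[] ords) { super(state); this.ords = ords; }
    @Override int compareBottom(int doc) { return state.bottomOrd - (ords[doc] & 0xFFFF); }
    @Override void copy(int slot, int doc) { state.slotOrds[slot] = ords[doc] & 0xFFFF; }
}

// Fallback for any other bit width: goes through the reader's get() method.
final class GenericOrdComparator extends OrdComparator {
    private final OrdReader reader;
    GenericOrdComparator(OrdSortState state, OrdReader reader) { super(state); this.reader = reader; }
    @Override int compareBottom(int doc) { return state.bottomOrd - reader.get(doc); }
    @Override void copy(int slot, int doc) { state.slotOrds[slot] = reader.get(doc); }
}
```

In this sketch the collector would do something like
"cmp = cmp.setNextReader(reader)" at each segment boundary, and the
queue state lives in a shared object so it carries over when the
comparator instance changes.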

This is all quite silly: we are only doing this to "game" hotspot into
properly inlining/compiling what is in fact an array lookup, just
currently hidden behind method calls in the packed ints impls.  We
really "shouldn't have to" do this custom source code specialization.

And I think that, in the future, a more general framework for
source-code specialization (LUCENE-1594) is a cleaner way to minimize
hotspot unpredictability.  Maybe once we cut over to that, we can
remove these cases of custom specialization in Lucene's core (the 12
private inner Collector impls in TopFieldCollector are another example).

Here are the results, comparing 3.x perf to trunk w/ the attached
patch -- all runs include the pending [separate] fix on LUCENE-2631:

Optimized index:

||Query||country||unique10||unique100||unique1K||unique10K||unique100K||unique1M||score||
|<all>|{color:red}8.5%{color}|{color:red}8.5%{color}|{color:red}8.4%{color}|{color:red}8.7%{color}|{color:red}8.7%{color}|{color:red}8.4%{color}|{color:red}9.4%{color}|{color:green}10.7%{color}|
|+united +states|{color:red}1.8%{color}|{color:green}0.6%{color}|{color:green}0.3%{color}|{color:green}0.4%{color}|{color:red}0.9%{color}|{color:red}0.7%{color}|{color:red}2.1%{color}|{color:green}2.9%{color}|
|"united states"|{color:green}5.2%{color}|{color:green}5.5%{color}|{color:green}5.7%{color}|{color:green}5.2%{color}|{color:green}5.2%{color}|{color:green}4.8%{color}|{color:green}6.9%{color}|{color:green}7.1%{color}|
|states|{color:red}4.6%{color}|{color:red}4.8%{color}|{color:red}4.1%{color}|{color:red}5.2%{color}|{color:red}5.1%{color}|{color:red}7.0%{color}|{color:red}3.8%{color}|{color:green}1.8%{color}|
|unite*|{color:red}2.0%{color}|{color:red}1.7%{color}|{color:red}3.0%{color}|{color:red}2.6%{color}|{color:red}2.4%{color}|{color:red}5.7%{color}|{color:red}6.0%{color}|{color:green}3.0%{color}|
|united states|{color:red}0.5%{color}|{color:red}0.4%{color}|{color:green}2.8%{color}|{color:green}2.6%{color}|{color:green}3.1%{color}|{color:green}2.1%{color}|{color:red}1.1%{color}|{color:green}2.0%{color}|


Multi-segment index (5% deletions):

||Query||country||unique10||unique100||unique1K||unique10K||unique100K||unique1M||score||
|<all>|{color:red}10.0%{color}|{color:red}10.2%{color}|{color:red}10.1%{color}|{color:red}9.4%{color}|{color:red}9.4%{color}|{color:red}10.1%{color}|{color:red}10.0%{color}|{color:green}5.1%{color}|
|+united +states|{color:red}7.2%{color}|{color:red}7.5%{color}|{color:red}7.7%{color}|{color:red}8.5%{color}|{color:red}8.4%{color}|{color:red}7.1%{color}|{color:red}5.4%{color}|{color:red}1.9%{color}|
|"united states"|{color:green}4.5%{color}|{color:green}4.2%{color}|{color:green}4.0%{color}|{color:green}3.8%{color}|{color:green}4.5%{color}|{color:green}4.3%{color}|{color:green}3.7%{color}|{color:green}4.2%{color}|
|states|{color:red}6.5%{color}|{color:red}8.6%{color}|{color:red}7.3%{color}|{color:red}6.9%{color}|{color:red}7.5%{color}|{color:red}9.4%{color}|{color:red}9.9%{color}|{color:red}1.3%{color}|
|unite*|{color:red}4.5%{color}|{color:red}5.3%{color}|{color:red}4.3%{color}|{color:red}3.9%{color}|{color:red}4.5%{color}|{color:red}4.7%{color}|{color:red}4.7%{color}|{color:red}0.4%{color}|
|united states|{color:red}4.6%{color}|{color:red}2.4%{color}|{color:red}3.2%{color}|{color:red}3.4%{color}|{color:red}1.9%{color}|{color:red}4.8%{color}|{color:red}3.3%{color}|{color:red}1.9%{color}|

So... this fix does make up much of the difference; we still seem to
be a bit (single digits) slower, but I think this is acceptable given
the massive reduction in RAM required for the FieldCache entry.


> sorting performance regression
> ------------------------------
>
>                 Key: LUCENE-2504
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2504
>             Project: Lucene - Java
>          Issue Type: Bug
>    Affects Versions: 4.0
>            Reporter: Yonik Seeley
>             Fix For: 4.0
>
>         Attachments: LUCENE-2504.patch, LUCENE-2504.zip
>
>
> sorting can be much slower on trunk than branch_3x

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

