lucene-dev mailing list archives

From "Michael McCandless (JIRA)" <>
Subject [jira] Commented: (LUCENE-2504) sorting performance regression
Date Tue, 14 Sep 2010 10:21:33 GMT


Michael McCandless commented on LUCENE-2504:

The fickleness of the hotspot compiler is just awful and, frankly,
unacceptable.  Java (Oracle) really needs to do something to address

EG, see my post here:  

In that standalone test, I can get drastically different search
performance depending on what code runs first.  Hotspot gets itself
into a state where it's "stuck" and is not able to re-optimize for the
code that's running.  When I disassembled the methods hotspot had
compiled, one thing I found was that readVInt (the hottest of hot in
Lucene today) was compiled very differently depending on what code ran

The changes we've had to make to Lucene/Solr in this issue to work
around hotspot are horrible -- we've introduced ugly duplicated,
specialized code so that hotspot properly detects that given method
calls are in fact just an array lookup.  We've made similar
specializations elsewhere in Lucene...
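
To illustrate the kind of duplication meant here, the following is a
rough sketch, not the actual patch.  All names (DocValuesSource,
ArraySource, SpecializationSketch, minDoc) are made up for the
example.  The generic loop reaches each value through an interface
call; the specialized copy repeats the same logic against the raw
array so the lookup is visibly a plain array access.

```java
// Hypothetical abstraction over per-document values.
interface DocValuesSource {
    int get(int doc);
}

// Array-backed implementation: get() is really just an array lookup.
final class ArraySource implements DocValuesSource {
    final int[] values;

    ArraySource(int[] values) {
        this.values = values;
    }

    public int get(int doc) {
        return values[doc];
    }
}

final class SpecializationSketch {
    // Generic path: every lookup is an interface call that the JIT
    // may or may not inline, depending on what it has seen so far.
    static int minDocGeneric(DocValuesSource src, int maxDoc) {
        int best = -1;
        for (int doc = 0; doc < maxDoc; doc++) {
            if (best == -1 || src.get(doc) < src.get(best)) {
                best = doc;
            }
        }
        return best;
    }

    // Specialized duplicate: identical logic, direct array access.
    static int minDocArray(int[] values, int maxDoc) {
        int best = -1;
        for (int doc = 0; doc < maxDoc; doc++) {
            if (best == -1 || values[doc] < values[best]) {
                best = doc;
            }
        }
        return best;
    }

    // Dispatch once, outside the hot loop.
    static int minDoc(DocValuesSource src, int maxDoc) {
        if (src instanceof ArraySource) {
            return minDocArray(((ArraySource) src).values, maxDoc);
        }
        return minDocGeneric(src, maxDoc);
    }
}
```

Both paths return the same answer; the cost is that every such loop
now exists twice and must be kept in sync by hand.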

Weirdly, I've found that running java with -Xbatch gives far more
repeatable results.  This is bizarre because that option forces
compilation to run in the foreground; it's not supposed to alter which
methods hotspot chooses to optimize, and, how much (I think?).  Though
maybe because threads are paused awaiting compilation it alters
hotspot's targets?  However, -Xbatch doesn't always give the fastest

Not that we have a choice here... but I've often wondered whether .NET
has this same hotspot fickleness problem.

I think this is a severe and growing problem for Lucene going forward
-- our search performance is crucial and we can't risk hotspot
randomly and substantially slowing things down.  We're unable
to do true performance tuning when hotspot "noise" easily dwarfs the
effects we're trying to measure.

I think the only viable option going forward is to create a search
framework that's able to generate its own specialized java code.  We'd
use this, statically, to generate pieces of the search execution
path that we think are common enough to warrant up-front specialization,
but also expose it dynamically so apps can "optimize" for their query
paths, either statically (pre-built/compiled in their apps) or
dynamically (like how a JSP rewrites to java code and is then
compiled).  Of course we'd still retain the non-specialized code, as a
fallback to handle those cases the specializer can't yet cover, or,
for apps where the net bytecode must be kept smallish.
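
As a very rough sketch of the source-generation half of this idea:
the generator below emits Java source for a comparator specialized to
one primitive array type.  Everything here (ComparatorGenSketch,
generate, the shape of the emitted class) is hypothetical; no such
framework exists in Lucene, and the compile-and-load step (eg, via
javax.tools.JavaCompiler, as a JSP container does) is left out.

```java
final class ComparatorGenSketch {
    // Maps a primitive type name to the wrapper class whose static
    // compare(x, y) method the generated code will call.
    static String wrapperFor(String primitiveType) {
        switch (primitiveType) {
            case "int":    return "Integer";
            case "long":   return "Long";
            case "float":  return "Float";
            case "double": return "Double";
            default:
                throw new IllegalArgumentException(
                    "unsupported type: " + primitiveType);
        }
    }

    // Emits source for a comparator that reads a primitive array
    // directly, with the sort direction baked in at generation time.
    static String generate(String className, String primitiveType,
                           boolean reverse) {
        String wrapper = wrapperFor(primitiveType);
        String first = reverse ? "docB" : "docA";
        String second = reverse ? "docA" : "docB";
        StringBuilder sb = new StringBuilder();
        sb.append("public final class ").append(className).append(" {\n");
        sb.append("    private final ").append(primitiveType)
          .append("[] values;\n");
        sb.append("    public ").append(className).append("(")
          .append(primitiveType).append("[] values) {\n");
        sb.append("        this.values = values;\n");
        sb.append("    }\n");
        sb.append("    public int compare(int docA, int docB) {\n");
        sb.append("        return ").append(wrapper)
          .append(".compare(values[").append(first)
          .append("], values[").append(second).append("]);\n");
        sb.append("    }\n");
        sb.append("}\n");
        return sb.toString();
    }
}
```

Calling generate("IntFieldComparator", "int", false) yields a class
with no branches on type or direction left in the compare method,
which is the property the specializer would be after.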

In theory such a search autogen framework could also generate into
C/C++, enabling us to choose a good point to wrap the result with JNI
(eg, TopDocsCollector.topDocs), which'd be wonderful as it'd fully
sidestep the hotspot fickleness.

> sorting performance regression
> ------------------------------
>                 Key: LUCENE-2504
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Bug
>    Affects Versions: 4.0
>            Reporter: Yonik Seeley
>             Fix For: 4.0
>         Attachments: LUCENE-2504.patch, LUCENE-2504.patch,
> sorting can be much slower on trunk than branch_3x

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
