lucene-dev mailing list archives

From "Michael McCandless (JIRA)" <>
Subject [jira] Commented: (LUCENE-2504) sorting performance regression
Date Tue, 14 Sep 2010 17:11:34 GMT


Michael McCandless commented on LUCENE-2504:

bq. I think we all owe it to ourselves to stop equating java with Oracle, if Java
stays with Oracle it's pretty obvious the language (is) will die anyway.

Yeah I agree.

The open question is whether this hotspot fickleness is particular to
Oracle's java impl, or is somehow endemic to bytecode VMs (.NET
included).  It's really a hard, complex problem (JIT compilation from
bytecode based on runtime data), so it wouldn't surprise me if it's
the latter, to varying degrees.

bq. .NET is not a choice but generating C/C++ code is?

As far as I know it's much easier to invoke C/C++ from java, than .NET
from java.  C/C++ is also more portable than .NET, I think?  (There is
Mono -- how mature is it by now?).

I don't think we should jump the gun and make real design/architectural
choices based on Oracle bugs.

I expect source code spec will also buy sizable perf gains
irrespective of hotspot fickleness, and in non-Oracle java impls.
Generating a dedicated class, with one method doing all searching and
collecting, removes all kinds of barriers to the JIT compiler.  It
makes its job far easier.
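
To make the idea concrete, here is a minimal sketch of the difference between a generic search loop and the kind of single-method class a code generator might emit. Every name in it (SpecializationSketch, DocCollector, searchGeneric, searchSpecialized) is hypothetical and is not Lucene's API; the lambda syntax also assumes a newer Java than the 1.6 JVMs discussed here. It only illustrates the shape of the code, and makes no claim about actual speedups.

```java
import java.util.function.IntPredicate;

// Hypothetical names throughout; this is an illustration, not Lucene's API.
final class SpecializationSketch {

    /** Generic path: collecting goes through an interface call. */
    interface DocCollector {
        void collect(int doc);
    }

    static final class CountingCollector implements DocCollector {
        int count;

        @Override
        public void collect(int doc) {
            count++;
        }
    }

    /** Generic loop: the filter and the collector are both pluggable. */
    static void searchGeneric(int[] postings, IntPredicate filter, DocCollector collector) {
        for (int doc : postings) {
            if (filter.test(doc)) {
                collector.collect(doc);
            }
        }
    }

    /** Runs the generic loop with a bitset filter and a counting collector. */
    static int countGeneric(int[] postings, long[] filterBits) {
        CountingCollector collector = new CountingCollector();
        searchGeneric(postings,
                doc -> (filterBits[doc >>> 6] & (1L << (doc & 63))) != 0,
                collector);
        return collector.count;
    }

    /**
     * Specialized path: one method for "postings + bitset filter + hit count",
     * with the filter test and the collecting written directly into the loop.
     */
    static int searchSpecialized(int[] postings, long[] filterBits) {
        int count = 0;
        for (int i = 0; i < postings.length; i++) {
            int doc = postings[i];
            if ((filterBits[doc >>> 6] & (1L << (doc & 63))) != 0) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        int[] postings = {0, 3, 64, 65, 130};
        long[] filterBits = new long[3];
        for (int doc : new int[] {3, 64, 130}) {
            filterBits[doc >>> 6] |= 1L << (doc & 63);
        }
        // Both paths count the same hits; only the code shape differs.
        System.out.println(countGeneric(postings, filterBits));
        System.out.println(searchSpecialized(postings, filterBits));
    }
}
```

In the generic version the JIT has to see through the filter and collector call sites before it can optimize the loop; in the specialized version there is nothing left to see through.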

bq. I agree with robert that we should stop comparing against sun jvms all the time and turn
everything upside-down specializing code here and there or go one step further and generate
C++ code. Dude who is gonna maintain the compatibility to Java-Only environments?

If we manage to pursue specialized code gen, it'll be a loooong time
coming!  My point about C/C++ is that if we do somehow manage to get a
working code gen framework online (for Java), the added cost to make
it also target C/C++ will be "relatively" small.  Ie, it's nearly "for free".

If we were to do this, that would not mean we'd abandon java, of
course -- the framework would fully support "pure java" as well.

bq. I think that code specializations of very "hot" part of lucene are ok and we should follow
that way like we did at some places but it already make things very complicated to follow.

You mean manual specialization right (like this issue)?

Yes, I think we will have to keep manually specializing, going
forward, until we have a code generator that does it more cleanly...

bq. Would it make way more sense to push OSS JVMs than spending lots of time on investigating
on .NET as an alternative or C/C++ code generator?

I think we should do both.

bq. Before I would go the C++ path I'd rather use Java to host a C core like lucy which brings
you as close as it gets to the machine.

I think this (a Java wrapper for Lucy) is a great idea -- we should explore that, too.

bq. interesting papers - seems we are touching the limits of Java though.

Well that's the big question -- limits of Java or limits of Sun/Oracle's impl.

It looks like harmony has a ways to go on absolute performance: I just
ran a very quick benchmark (TermQuery search on 10 M multi-segment
wiki index w/ a 50% random filter) and Oracle java 1.6.0_21 gets 15.6
QPS while Harmony 1.5.0-r946978 gets 9.5 QPS (Harmony 1.6.0-r946981
also gets 9.5 QPS).  I just ran java -server -Xms2g -Xmx2g; it's
possible that tuning Harmony (it has many awesome looking command-line
args!) would make it faster...
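
For anyone wanting to repeat this kind of quick comparison across JVMs, a QPS number can be obtained with something like the sketch below. This is not the benchmark used for the figures above; QpsSketch and measureQps are hypothetical names, the query is a stand-in IntSupplier rather than a real TermQuery, and the lambda syntax assumes a newer Java than the JVMs compared here. The untimed warm-up loop is there to give the JIT a chance to compile the hot path before timing starts.

```java
import java.util.function.IntSupplier;

// Hypothetical helper; not part of Lucene or of the benchmark quoted above.
final class QpsSketch {

    /**
     * Runs the query `warmup` times untimed, then `timed` times under a timer,
     * and returns queries per second for the timed portion.
     */
    static double measureQps(IntSupplier query, int warmup, int timed) {
        int sink = 0; // accumulate results so the work cannot be optimized away
        for (int i = 0; i < warmup; i++) {
            sink += query.getAsInt();
        }
        long start = System.nanoTime();
        for (int i = 0; i < timed; i++) {
            sink += query.getAsInt();
        }
        long elapsedNanos = Math.max(1, System.nanoTime() - start);
        if (sink == Integer.MIN_VALUE) {
            System.out.println(sink); // keeps `sink` observably live
        }
        return timed * 1e9 / elapsedNanos;
    }

    public static void main(String[] args) {
        // Stand-in "query": sums a small array and returns the total.
        int[] data = new int[1024];
        for (int i = 0; i < data.length; i++) {
            data[i] = i;
        }
        IntSupplier query = () -> {
            int sum = 0;
            for (int v : data) {
                sum += v;
            }
            return sum;
        };
        System.out.println("QPS: " + measureQps(query, 1000, 10000));
    }
}
```

Running the same class under each JVM with identical flags (eg the -server -Xms2g -Xmx2g used above) keeps the comparison apples-to-apples.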

> sorting performance regression
> ------------------------------
>                 Key: LUCENE-2504
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Bug
>    Affects Versions: 4.0
>            Reporter: Yonik Seeley
>             Fix For: 4.0
>         Attachments: LUCENE-2504.patch, LUCENE-2504.patch, LUCENE-2504.patch,
> sorting can be much slower on trunk than branch_3x

