incubator-lucy-dev mailing list archives

From Nathan Kurz <n...@verse.com>
Subject Re: [lucy-dev] Host overriding of all non-final methods
Date Mon, 07 Mar 2011 08:17:42 GMT
On Sun, Mar 6, 2011 at 9:03 PM, Marvin Humphrey <marvin@rectangular.com> wrote:
> On Fri, Mar 04, 2011 at 11:27:59PM -0800, Nathan Kurz wrote:
>> On Wed, Mar 2, 2011 at 10:30 PM, Marvin Humphrey <marvin@rectangular.com> wrote:
>> > Arguably, we don't even need the "final" keyword.  We should benchmark to
>> > confirm my recollection about the performance implications, but I'll bet we
>> > could remove it with no immediate impact on Lucy.
>>
>> I'd suggest this as the cleanest solution.  Intuitively, I'd think
>> that the benefit of 'final' would be very small, such that if one
>> really cares about performance one should inline the function call
>> completely and not worry about saving a single dereference.
>
> Sounds good -- I'll work up a patch.  We'll leave the "inline" keyword in
> Clownfish, but drop the "final" keyword.

Great.  I think you could get away with dropping 'inline' as well.  My
point was not that Clownfish needs to inline things, but that if you
really want to squeeze out the last drop of performance by avoiding
function calls, you'll have to take control of the compilation
yourself, likely by rewriting the entire core class in some unreadable
and unmodifiable fashion.

It would be interesting, though, to someday benchmark the potential
advantage here.  Once you're generating code, it wouldn't be hard to
test-generate a monolithic 'final' library as well, with inter-class
inlining.  But if one were to take this route for anything production,
I think it would make more sense as an overall compile time option
rather than a method-by-method keyword: -O lock-it-all-down.
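
As a rough illustration of the compile-time-option idea, here is a minimal
sketch in C. Every name in it (Scorer, SCORER_NEXT, LOCK_IT_ALL_DOWN) is
hypothetical and is not taken from Clownfish's generated code; it only shows
how one build flag could switch every call site between overridable dispatch
through a vtable and a direct call that the compiler is free to inline.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical class layout, for illustration only. */
typedef struct Scorer Scorer;

typedef struct ScorerVTable {
    int (*next)(Scorer *self);   /* overridable method slot */
} ScorerVTable;

struct Scorer {
    const ScorerVTable *vtable;
    int doc_id;
};

/* The core implementation of the method. */
static int Scorer_next_impl(Scorer *self) {
    return ++self->doc_id;
}

static const ScorerVTable SCORER_VTABLE = { Scorer_next_impl };

static void Scorer_init(Scorer *self) {
    assert(self != NULL);
    self->vtable = &SCORER_VTABLE;
    self->doc_id = 0;
}

/* One global switch instead of a per-method 'final' keyword. */
#ifdef LOCK_IT_ALL_DOWN
  /* Direct call: resolved at compile time and eligible for inlining. */
  #define SCORER_NEXT(self) Scorer_next_impl(self)
#else
  /* Overridable call: two loads plus an indirect call. */
  #define SCORER_NEXT(self) ((self)->vtable->next(self))
#endif
```

Both build modes return the same results; only the dispatch path differs,
which is what a benchmark of the two builds would measure.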

> Looking forward, we'll need to think about how to design our classes and
> interfaces so that time-critical functionality can be inlined whenever
> possible.

While there is some small gain to be had here, I don't think it should
be a priority.  I can be as cycle-count-conscious as anyone, but once
you're memory bound there isn't that much advantage in optimizing
cycles much further.  I think we can get far by keeping the base
architecture fast (as it currently is) and concentrating on data
layout.  To the extent that one does worry about cycles, it's not the
function calls that need to be avoided, rather the mispredicted
branches.  So long as you take the same convoluted path every time,
modern processors are monstrously efficient.
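
To make the branch-prediction point concrete, here is a small sketch in C
(the function names and the non-negative-input assumption are illustrative,
not from Lucy). Both functions compute the same sum. The first contains a
data-dependent branch whose cost depends on how predictable the comparison
is; the second replaces the branch with a mask, so it runs the same
instruction sequence whatever the data looks like. An optimizing compiler may
well turn the first form into the second on its own, so any difference has to
be measured rather than assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Branchy: cheap when the comparison is predictable (e.g. sorted input),
 * expensive when it flips at random and the predictor keeps missing. */
static uint64_t sum_above_branchy(const int32_t *data, size_t n,
                                  int32_t threshold) {
    assert(data != NULL || n == 0);
    uint64_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (data[i] >= threshold) {
            sum += (uint64_t)data[i];
        }
    }
    return sum;
}

/* Branchless: the comparison yields 0 or 1, which is turned into an
 * all-zeros or all-ones mask.  Assumes non-negative input values. */
static uint64_t sum_above_branchless(const int32_t *data, size_t n,
                                     int32_t threshold) {
    assert(data != NULL || n == 0);
    uint64_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t mask = (uint64_t)0 - (uint64_t)(data[i] >= threshold);
        sum += (uint64_t)data[i] & mask;
    }
    return sum;
}
```

Timing the branchy version on sorted versus shuffled copies of the same array
is one simple way to see the misprediction penalty in isolation.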

Let's make the Northbridge scream for mercy.

Nathan Kurz
nate@verse.com
