harmony-dev mailing list archives

From Robin Garner <Robin.Gar...@anu.edu.au>
Subject Re: [arch] VMCore / Component Model
Date Thu, 22 Sep 2005 08:24:05 GMT
On Tue, 2005-09-20 at 08:40 -0400, Geir Magnusson Jr. wrote:

>> On Sep 20, 2005, at 12:26 AM, Robin Garner wrote:
>>> > I think it's important not to take Tim's point about performance too
>>> > lightly here.  There are some key interfaces between components that
>>> > can't afford the overhead of a function call, let alone an indirect
>>> > call via a function pointer.
>>> >
>>> > Three instances that come to mind are:
>>> > - Allocation.  For best performance, the common case of a new() needs
>>> >   to be inlined directly into the compiled code.
>>> > - Write barrier (and read barrier if we need one).  The common case
>>> >   of a write barrier should be a handful of instructions.
>>> > - Yield points.  Again should inline down to a couple of instructions.
>>> > I'd be interested in any approaches you may have thought of for these
>>> > interfaces.
>> Are these things the components would worry about, or can an  
>> optimizer deal with these "later" if possible?

I believe these are so performance-critical that a design needs to be in place up front.  If
Harmony is to be competitive in performance terms with the current production VMs, we need
to make sure that allocation, barriers, locking etc. are as fast as possible.  A design that
adds even one extra memory reference to one of these operations would not be competitive.
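To make the stakes concrete, here is a minimal sketch of the bump-pointer allocation fast path
a JIT would want to inline directly at each new().  The names (alloc_cursor, alloc_limit) and
the use of globals are invented for illustration; a real VM keeps these in per-thread state:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bump-pointer fast path.  The whole common case is a
 * load, an add, a compare, and a store - the kind of sequence that
 * must be inlined, not reached via an indirect call. */
static uintptr_t alloc_cursor;   /* next free address */
static uintptr_t alloc_limit;    /* end of the allocation buffer */

static void *alloc_fast(size_t bytes) {
    uintptr_t result = alloc_cursor;
    uintptr_t next = result + bytes;
    if (next > alloc_limit)
        return NULL;             /* take the out-of-line GC slow path */
    alloc_cursor = next;
    return (void *)result;
}
```

A call (never mind an indirect call) into a memory-manager component at this point would
dwarf the cost of the operation itself.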

>> Now, I'm really just making this up as I go along... what some people  
>> call "thinking out loud", but I won't grace this with the term  
>> "thinking".  I've never written or been inside VM internals, so I'm  
>> inventing out of whole cloth here...
>> In earlier discussions of componentization, I kept imagining a model  
>> where we have a defined set of capabilities that a component could  
>> optionally implement.  There would be a required set, as well as  
>> optional ones.  (This thinking is inspired by the old MSFT COM  
>> infrastructure...)
>> In the case of a memory manager, a required one would be the  
>> "interface" containing the function pointers for a memory management.
>> An optional one would be "NativeInlineConversion" or something, where  
>> an optimizer could find a new() and if it doesn't support the native  
>> inlining, use the function call into the component, and if it does,  
>> ask the component for the bit of compiled code to use.

I think this is probably doable, although for some things it would be reasonable to require
the component to provide the inlineable code.  The 'native code' passed back would have to
be at some level of compiler IR (intermediate representation).  Unfortunately this has the
side effect of making the compiler's IR public, breaking the modularity of the compiler(s)
and to an extent distributing chunks of the compiler into other components.  This is much
less of a problem when doing Java-in-Java, but that's another thread.
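For what it's worth, the required-plus-optional capability model could look something like
the sketch below: a component publishes its required entry points as function pointers, and
the compiler probes for the optional inlining hook before falling back to an out-of-line
call.  Every name here is invented for illustration; none of it is proposed Harmony API:

```c
#include <stddef.h>
#include <stdlib.h>

typedef struct IRFragment IRFragment;        /* opaque compiler IR */

/* A memory-manager component: one required entry point, one
 * optional capability (NULL when unsupported). */
typedef struct MMInterface {
    void *(*allocate)(size_t bytes);         /* required */
    IRFragment *(*inline_alloc_ir)(size_t);  /* optional */
} MMInterface;

/* The compiler checks the optional capability before deciding
 * whether to splice in IR or emit a plain call. */
static int supports_inlining(const MMInterface *mm) {
    return mm->inline_alloc_ir != NULL;
}

/* A trivial component that only supports the required interface. */
static MMInterface simple_mm = { malloc, NULL };
```

Note that the IR-publishing problem above bites as soon as inline_alloc_ir returns anything:
the IRFragment type stops being opaque to the component that produces it.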

Actually, the more I think about this, the less I see the value in designing a component structure
that allows run-time configurability.  I believe compile-time configurability is the thing
to aim for.  Achieving good levels of abstraction without sacrificing performance is a difficult
enough job at compile time in any case.

Consider for example the design of the object model, and say that you want to support:
1) A real-time oriented incremental garbage collector
2) A mark-sweep collector for memory constrained environments
3) A generational reference counting collector as the general purpose
high performance configuration.

Collector 1) (if, say, you used a Brooks-style barrier) requires an additional word in the
object header to store a forwarding pointer, for all objects, live and dead.  Collector 2)
requires only one bit in the object header (for a mark bit) and could conceivably steal a
low-order bit from the class pointer (TIB in JikesRVM terms).  Collector 3) requires an
extra word (or at least several extra bits) in mature objects, and performs better if
nursery objects don't need that extra word.

The object header also needs to store metadata associated with locking, address-based hashing
and the TIB (pointer to per-class information).  

There is a complex tradeoff between the size and encoding of information in the header word
(see, for example, http://www.research.ibm.com/people/d/dgrove/papers/ecoop02.html), and
different layouts are possible depending on the implementations selected.

Much of the code in the runtime system needs to know how the object headers are laid out,
and to access the metadata critical to them very rapidly.  For example, a thin lock takes
5-10 instructions to acquire (in the frequent case).  Adding (for example) a table lookup
to find where in the object header the system had dynamically encoded the lock field would
be disastrous for locking performance.
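To illustrate, the frequent case of a thin lock is roughly a load, a test, and one
compare-and-swap on the header's lock bits - sketched below with an invented bit layout and
thread-id encoding (C11 atomics stand in for whatever the VM would actually emit):

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical thin-lock fast path.  TL_LOCK_MASK must be a
 * compile-time constant for this to stay in the 5-10 instruction
 * budget; a table lookup here would add loads on the hot path. */
#define TL_LOCK_SHIFT 2
#define TL_LOCK_MASK  ((uintptr_t)0x3FF << TL_LOCK_SHIFT)

static bool thin_lock_acquire(_Atomic uintptr_t *header, uintptr_t tid) {
    uintptr_t old = atomic_load(header);
    if ((old & TL_LOCK_MASK) != 0)
        return false;  /* already held or contended: take the slow path */
    uintptr_t locked = old | ((tid << TL_LOCK_SHIFT) & TL_LOCK_MASK);
    return atomic_compare_exchange_strong(header, &old, locked);
}

static void thin_lock_release(_Atomic uintptr_t *header) {
    atomic_fetch_and(header, ~TL_LOCK_MASK);
}
```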

So at least for modules that are critical to the performance of compiled code, I believe
runtime configurability will be dogged by performance problems.  Compilers, classloaders
etc. - fine; but the core runtime components - no.

