lucy-dev mailing list archives

From Marvin Humphrey <>
Subject Re: [lucy-dev] OFFSET globals
Date Tue, 24 Apr 2012 23:32:04 GMT
On Tue, Apr 24, 2012 at 2:21 PM, Nathan Kurz <> wrote:
> On Tue, Apr 24, 2012 at 4:32 AM, Marvin Humphrey <> wrote:

>>> 4) In all other cases (removal or signature change of an inherited
>>> method) we just want to detect the incompatibility at start time and
>>> die with an appropriate error message rather than randomly crashing or
>>> doing the wrong thing.
>> +0.  Nice if possible, but from the perspective of an existing compiled
>> extension, it's hard to detect when upstream has broken an ABI.
> Difficult, but I think as necessary as the other requirements.  If we
> don't do this, we have no way of knowing whether the system is stable,
> or just hasn't yet encountered a broken pathway.  The main library
> needs to increment something every time an internal ABI changes, and this
> needs to be checked by the extension.   This is one of the places I
> hope we can use DSO symbol versioning, so this check would happen
> automatically at runtime linkage.

The general solution is certainly for the upstream author to signal ABI
breakage via version number changes.  What I don't know how to do is automatic
detection of accidental or irresponsible ABI breakage -- we *have* to rely on
author-supplied version numbers.
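
As a minimal sketch of what relying on author-supplied version numbers could
look like at start time: the extension records the ABI version it was compiled
against, and the core refuses to load it on a mismatch.  All names here
(CORE_ABI_VERSION, ext_info_t, check_ext_abi) are hypothetical and do not come
from Lucy; the sketch only catches breakage that the upstream author remembered
to signal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical: bumped by the upstream author on every known ABI break. */
#define CORE_ABI_VERSION 3

/* Hypothetical record that an extension fills in at compile time. */
typedef struct {
    const char *name;
    int         compiled_against;  /* CORE_ABI_VERSION seen at build time */
} ext_info_t;

/* Returns 0 if the extension may be loaded, -1 on a version mismatch. */
int
check_ext_abi(const ext_info_t *ext, int core_version) {
    assert(ext != NULL);
    if (ext->compiled_against != core_version) {
        fprintf(stderr, "%s: built against ABI %d, core provides ABI %d\n",
                ext->name, ext->compiled_against, core_version);
        return -1;
    }
    return 0;
}
```

A bootstrapping routine could call check_ext_abi() once per extension and die
with the error message instead of crashing later.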

>> 6) It should be possible to remove methods from upstream which are not marked
>> as "public".
> What would be the desired behaviour when a compiled extension
> references a removed method?

A compiled extension should not reference a non-public method.  When the
build environment supports it, we should avoid exporting such symbols.  If a
non-public symbol gets exported despite our efforts and an extension author
references it, a runtime error when symbol relocation or bootstrapping fails is

The main point here is that we should only propagate OFFSET vars into
extension DSOs when those OFFSET vars correspond to *public* methods.  That
way, our automatically generated bootstrapping routines will not break when a
non-public method goes away.
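
The idea can be modeled roughly as follows.  This is a simplified sketch, not
Lucy's generated bootstrapping code: it assumes a vtable is an array of function
pointers, and it stands in for exported OFFSET symbols with a small name/offset
table.  The types, the helper functions and the Dog methods are all
hypothetical.  Offsets are recorded only for public methods, so removing the
non-public groom() shifts eat() to a new slot without breaking a lookup that
goes through the exported offsets.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef const char *(*method_t)(void);

typedef struct {
    const char *name;
    method_t    func;
    int         is_public;
} method_spec_t;

/* Stand-in for an exported OFFSET variable: a name and a vtable slot. */
typedef struct {
    const char *name;
    size_t      offset;
} offset_var_t;

const char *dog_speak(void) { return "speak"; }
const char *dog_groom(void) { return "groom"; }
const char *dog_eat(void)   { return "eat"; }

/* Dog before and after upstream removes the non-public groom(). */
const method_spec_t DOG_V1[] = {
    { "speak", dog_speak, 1 },
    { "groom", dog_groom, 0 },
    { "eat",   dog_eat,   1 },
};
const method_spec_t DOG_V2[] = {
    { "speak", dog_speak, 1 },
    { "eat",   dog_eat,   1 },
};

/* Fill the vtable in declaration order and record an offset for each
 * public method only.  Returns the number of offsets recorded. */
size_t
bootstrap_class(const method_spec_t *specs, size_t num_specs,
                method_t *vtable, offset_var_t *exported) {
    size_t num_exported = 0;
    assert(specs != NULL && vtable != NULL && exported != NULL);
    for (size_t i = 0; i < num_specs; i++) {
        vtable[i] = specs[i].func;
        if (specs[i].is_public) {
            exported[num_exported].name   = specs[i].name;
            exported[num_exported].offset = i;
            num_exported++;
        }
    }
    return num_exported;
}

/* What an extension may do: reach a method through an exported offset.
 * Returns NULL when no offset was exported under that name. */
method_t
lookup_public(const method_t *vtable, const offset_var_t *exported,
              size_t num_exported, const char *name) {
    for (size_t i = 0; i < num_exported; i++) {
        if (strcmp(exported[i].name, name) == 0) {
            return vtable[exported[i].offset];
        }
    }
    return NULL;
}
```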

>> Then, after we finish dealing with methods, the next step is to give upstream
>> the ability to add, remove or rearrange member variables. :)
> Can this wait for the second round, or should this be solved now as well?

It is a separate issue.  I have some ideas, though.

> For reference, here's a followup from Ian Wienand.  I don't yet
> comprehend his suggestion regarding libbfd.
> Ian Wienand replies:
>> Nathan Kurz writes:
>>    Unfortunately, the externally compiled Boxer_vtable has a fixed layout,
>>    and it puts Boxer_drool in the slot where the core expects to find eat().
>>    When the core tries to call eat() on a Boxer object, chaos will ensue.
> Yeah; aren't you also relying on the compiler laying out the
> structures the same, independent of you adding a function?

In Lucy's C code, we certainly rely on struct equivalence, make heavy use of
type punning and violate strict aliasing rules -- but the problem Ian is
referring to here has never bitten us.  We are at least aware enough of the
vagaries of struct padding to have avoided getting into trouble so far. :)

In the case of the Boxer_vtable and Dog_vtable sample pseudo-code, I omitted
types for brevity, apparently at the expense of clarity. Those vtables
would have been arrays of function pointers rather than structs.

       cfish_method_t Dog_vtable[] = {

       cfish_method_t Boxer_vtable[] = {

       // After revising "Dog"...

       cfish_method_t Dog_vtable[] = {

I would consider it a surprising result if the compiler were to lay those
arrays out incompatibly.  :)
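
For illustration only, here is a hypothetical reconstruction of the slot
collision using arrays of function pointers.  The method names other than
Boxer_drool and eat(), the method_t stand-in for cfish_method_t, and the _v1/_v2
suffixes are assumptions made for this sketch.  Array elements are contiguous,
so the layout itself is predictable; the trouble is that Boxer's fixed slot 1
holds drool() where the revised core expects eat().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for cfish_method_t; the real signature is not shown here. */
typedef const char *(*method_t)(void);

const char *dog_bark(void)    { return "bark"; }
const char *dog_eat(void)     { return "eat"; }
const char *boxer_drool(void) { return "drool"; }

/* Original "Dog": a single method in slot 0. */
method_t Dog_vtable_v1[] = { dog_bark };

/* "Boxer", compiled against the original Dog: inherits slot 0 and
 * puts its own new method in slot 1. */
method_t Boxer_vtable[] = { dog_bark, boxer_drool };

/* After revising "Dog": eat() is added and lands in slot 1. */
method_t Dog_vtable_v2[] = { dog_bark, dog_eat };
enum { DOG_V2_EAT_SLOT = 1 };

/* The core dispatching through a hardcoded slot number. */
const char *
call_slot(const method_t *vtable, size_t slot) {
    assert(vtable != NULL);
    return vtable[slot]();
}
```

With these definitions, call_slot(Boxer_vtable, DOG_V2_EAT_SLOT) runs
boxer_drool() rather than an eat() implementation.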

>> Our goal is to figure out a way to leverage the runtime linker to do
>> the lifting work so we don't have to do our own bootstrapping.  Likely
>> this will be by specifying the VTable lengths (offsets) as a variable
>> within the .so rather than as hardcoded.  But fully understand that
>> this *might* not be the best use of your time today.  Thanks for the
>> response!
> I don't think the dynamic linker is going to help you much with this
> ... it seems you really need to version "Dog" and have "Boxer" tell
> the core what version of "Dog" it implements.  Something you could
> consider is stamping this info into a separate section of the .so with
> something like
> #define DOG_VERSION(version) \
>  const char __dog_version[] __attribute__((section(""))) = \
> "dog_version =" #version
> in your "boxer" plugin, and then use libbfd to read and parse that
> section at load time, and hence know what members to call.

It was generous of Ian to code up that sample for us.

"libbfd" is an object file parser; we could use it to extract DSO symbol
versions (on some platforms) and then change the behavior of the bootstrapping
routine on the fly based on the versioning info extracted from the extension.

The problem we have been working on here is not our ability to version
symbols -- it is that it is tough for the upstream author
to keep track of when the ABI has actually been broken.  The upstream author
has to realize, "Oops, I moved the first declaration of Do_Stuff() up into
BasicObj from Obj, so now I need to increment the ABI version."  Such a system
is guaranteed to fail frequently due to "human error" because it asks too much
of humans.

Fortunately, the design that we've hashed out renders that unnecessary -- by
removing the mechanism that would cause the ABI to break in such subtle ways.
Now, we just rely on the upstream author to change up the version number when
they have done something obvious to break the ABI, such as changing a method

Marvin Humphrey
