lucy-dev mailing list archives

From: Nathan Kurz
Subject: Re: [lucy-dev] OFFSET globals
Date: Thu, 26 Apr 2012 20:10:36 GMT
On Wed, Apr 25, 2012 at 4:03 PM, Marvin Humphrey wrote:
> The output of CFC is definitely easier to understand than the code like
> S_bequeath_methods() that does the work. :)  Let's look at some generated
> code.
> [snip]

All wonderfully helpful and clear.  Thanks!

> I wouldn't say that TermQuery "signals" to anyone that it has overridden a
> method.

I think I understand now.   I was confused by the _OVERRIDE functions
that are (were?) used to provide Perl callbacks.  I thought at first
that the existence of this symbol was a signaling mechanism, but now
realize that it's just the name of the wrapper function.  I wonder if
naming it _WRAPPER, or _CALLBACK, or something host-language-specific
like _PERL, might be clearer.  Or it could just be me.

> The problem that Nick has been working on is how to cut down on the number of
> OFFSET vars.  In order to guarantee that certain seemingly innocuous
> refactoring actions won't break the ABI, the current release of Lucy generates
> a huge number of OFFSET vars -- and they're all exposed as globals in the
> DSO.  Under the new design, we will have far fewer OFFSET vars, and when the
> build environment supports it, they will not be visible as global symbols in
> the DSO.
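
For readers following along, here is a minimal sketch of what dispatch
through an exported OFFSET variable looks like.  It is not CFC output:
the names Demo_Get_Id, Demo_Get_Id_OFFSET, Base_get_id and Child_get_id
are hypothetical, and the VTable is reduced to a single method slot.

```c
#include <stddef.h>

typedef struct Obj Obj;
typedef int (*get_id_t)(Obj *self);

/* Hypothetical, stripped-down vtable: a class name plus one method slot. */
typedef struct VTable {
    const char *class_name;
    get_id_t    get_id;
} VTable;

struct Obj {
    const VTable *vtable;
};

/* One exported global per method: the byte offset of that method's slot. */
size_t Demo_Get_Id_OFFSET = offsetof(VTable, get_id);

/* Callers never hard-code the slot position; they go through the offset. */
static inline int Demo_Get_Id(Obj *self) {
    const char *slot = (const char *)self->vtable + Demo_Get_Id_OFFSET;
    return (*(const get_id_t *)slot)(self);
}

/* Two implementations, standing in for a parent class and an override. */
int Base_get_id(Obj *self)  { (void)self; return 1; }
int Child_get_id(Obj *self) { (void)self; return 2; }
```

Because the offset is read from a variable at call time, slots can be
reordered without recompiling callers.  The cost is one global symbol
per method, which is the symbol count being discussed here.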

I think I'm following along now, although I still have some fuzzy
areas.   Thanks for your patience in explaining.   As I read through
the archives, I see myself asking the same questions repeatedly over
many years, with you answering very kindly in each case.

In a previous message, Marvin writes:
> Supporting CPAN/Rubygems/PyPI-style development for compiled extensions is of
> paramount importance, IMO.

I think I'm now understanding the implications of this, which means I
forgot something from the list of requirements:

7) It must be possible to dynamically subclass a core class at runtime.

Currently, this is distinct from loading a C extension compiled as a
shared object, in that when we subclass from a scripting language the
_OFFSET globals are not created.  With our current mechanism, I think
this means that the dynamically created subclasses (and possibly the
dynamically loaded subclasses) are not quite whole:  one can't (I
think) dynamically subclass these subclasses.

I think these should be harmonized, and that there should also be:
7) a. Dynamically created subclasses should be indistinguishable from
and interchangeable with core classes.

As I see it (from my fuzzy vantage point) this would mean that either
we have to avoid reliance on DSO symbols like the _OFFSET variables,
or we have to create these symbols when the subclasses are made at
runtime (with something like libbfd, possibly by writing an ELF file
and reloading it with dlopen()).  Nick's proposal of doing a hash
table lookup for offsets leads to one solution, and the approach I was
envisioning (intimately twining with the dynamic loader) would be the
other.

Given the difficulty of doing cross-platform in-process symbol table
manipulation, Nick's approach of a runtime initialization with some
sort of find_offset() function seems quite appealing.  But if we want
the extensions to be first class citizens, I think we need to go all
the way and remove reliance on global _OFFSET symbols altogether, and
have all the metadata available from the VTable registry.
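
A rough sketch of that last idea, under stated assumptions: each VTable
carries its own method metadata, offsets are resolved by name at
runtime, and no global symbol is involved.  Everything here
(MethodMeta, the VTable layout, make_subclass, invoke, and this
particular find_offset signature) is hypothetical and is not Lucy's
actual API.  A linear scan stands in for a hash table lookup.

```c
#include <stddef.h>
#include <string.h>

typedef void (*method_t)(void *self);

/* Metadata for one method: its name and the byte offset of its slot. */
typedef struct MethodMeta {
    const char *name;
    size_t      offset;
} MethodMeta;

/* Hypothetical vtable that carries its own metadata. */
typedef struct VTable {
    const char          *class_name;
    const struct VTable *parent;
    const MethodMeta    *meta;      /* methods introduced by this class */
    size_t               num_meta;
    method_t             methods[4];
} VTable;

#define OFFSET_NOT_FOUND ((size_t)-1)

/* Resolve a method name to a slot offset by walking up the class chain. */
size_t find_offset(const VTable *vt, const char *name) {
    for (; vt != NULL; vt = vt->parent) {
        for (size_t i = 0; i < vt->num_meta; i++) {
            if (strcmp(vt->meta[i].name, name) == 0) {
                return vt->meta[i].offset;
            }
        }
    }
    return OFFSET_NOT_FOUND;
}

/* Call the method stored at a previously resolved offset. */
void invoke(const VTable *vt, void *self, size_t offset) {
    const char *slot = (const char *)vt + offset;
    (*(const method_t *)slot)(self);
}

/* Create a subclass at runtime: inherit every slot, add no new methods. */
VTable make_subclass(const VTable *parent, const char *name) {
    VTable vt = *parent;
    vt.class_name = name;
    vt.parent     = parent;
    vt.meta       = NULL;   /* inherited names are found via the parent */
    vt.num_meta   = 0;
    return vt;
}

/* A toy core class with a single method. */
int dump_calls = 0;
void Query_dump(void *self) { (void)self; dump_calls++; }
const MethodMeta QUERY_META[] = { { "Dump", offsetof(VTable, methods) } };
```

In this sketch a class made by make_subclass() can itself be passed to
make_subclass(), and find_offset() resolves "Dump" on the grandchild
the same way it does on the core class, since the lookup only follows
parent pointers and never consults the symbol table.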

(Wow, how's that for oblique.  But I think I'm on the right path here.)

Before thinking about approaches, there is at least one more potential
constraint that concerns me:  How important is it for objects to be
able to have a "private" VTable?  Is this a requirement, or just an
implementation detail of the current host language subclassing using
VTable_singleton?

