lucy-dev mailing list archives

From Nick Wellnhofer
Subject Re: [lucy-dev] Brittle struct ABI proof-of-concept
Date Wed, 09 May 2012 13:13:32 GMT
On 09/05/2012 03:33, Marvin Humphrey wrote:
> On Tue, May 8, 2012 at 4:26 AM, Nick Wellnhofer wrote:
>> Now suppose that a Dog object is passed to a function which accepts an
>> Animal parameter. This means we have to pass dog->superself which is an
>> Animal struct with a pointer to the Dog MetaClass. The function then invokes
>> Animal_speak which is overridden by Dog_speak in Dog.  This will use the
>> method pointers, offsets and fixups of the Dog MetaClass. But using the zero
>> fixup, the Animal object will be passed to Dog_speak which is wrong.  In
>> this case we'd need a negative fixup to "cast" the Animal object back to a
>> Dog object.
> Yep, that's a problem.  We can fix it by interleaving the secondary fixups
> with our method pointers, at a cost of increasing the per-method space
> requirements for MetaClass objects.

Unfortunately, secondary fixups aren't enough. We need different fixups 
for every superclass that an object might be cast to.
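
A minimal sketch of why one fixup per class is not enough, assuming a
layout where each subclass puts its own data in front of the embedded
parent struct. The member names, the helper Boxer_from_view and the use
of offsetof are illustrative assumptions only; in the proposed scheme
the sizes would be resolved at runtime through the MetaClass rather
than at compile time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout: subclass data is prepended, parent struct comes last. */
typedef struct Animal { const char *name; } Animal;
typedef struct Dog    { int tricks;    Animal superself; } Dog;
typedef struct Boxer  { double weight; Dog    superself; } Boxer;

/* Negative fixups that recover the Boxer from each superclass view.
 * They differ because the Animal view sits deeper inside the object. */
static const ptrdiff_t Boxer_from_Dog_fixup
    = -(ptrdiff_t)offsetof(Boxer, superself);
static const ptrdiff_t Boxer_from_Animal_fixup
    = -(ptrdiff_t)(offsetof(Boxer, superself) + offsetof(Dog, superself));

/* Apply a fixup to a superclass view to get back the full object. */
static Boxer*
Boxer_from_view(void *view, ptrdiff_t fixup) {
    assert(view != NULL);
    return (Boxer*)((char*)view + fixup);
}
```

An overriding method such as Boxer_speak needs Boxer_from_Dog_fixup when
the caller holds a Dog view and Boxer_from_Animal_fixup when the caller
holds an Animal view, so a single fixup stored next to the method
pointer cannot serve both.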

>     static inline void
>     Dog_speak(Dog *self) {
>         const uint64_t offsets = Dog_speak_OFFSETS;
>         char *const address = *(char**)self + (uint32_t)offsets;
>         Dog_speak_t method = *((Dog_speak_t*)address);
>         ptrdiff_t fixup = *((ptrdiff_t*)(address + sizeof(void*)))  //<------
>                           + (int32_t)(offsets >> 32);
>         void *const view = (char*)self + fixup;
>         method(view);
>     }
> Reference:
>      Bjarne Stroustrup: "Multiple Inheritance for C++"
>      5.1 Implementation
>      On entry to C::f, the this pointer must point to the beginning of the C
>      object (and not to the B part). However, it is not in general known at
>      compile time that the B pointed to by pb is part of a C so the compiler
>      cannot subtract the constant delta(B). Consequently delta(B) must be
>      stored so that it can be found at run time. Since it is only used when
>      calling a virtual function the obvious place to store it is in the table
>      of virtual functions (vtbl). For reasons that will be explained below the
>      delta is stored with each function in the vtbl so that a vtbl entry will
>      be of the form†:
>          struct vtbl_entry {
>              void (*fct)();
>              int delta;
>          };
>> It's possible to define separate fixups for the Animal struct inside the Dog
>> object, but this would further complicate the MetaClass initialization.
> The fixups don't have to be stored inside instantiated Dog objects, they can
> go in the MetaClass.  It does make MetaClass initialization slightly more
> complicated, but I think it's workable.

Yes, I meant to store the fixups in the MetaClass, but as mentioned 
above we have to store different fixups for every superclass/method
combination.
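
As a rough sketch of what "different fixups per superclass/method"
could look like, assuming one table entry per view in the style of the
vtbl_entry quoted above: SpeakEntry, Dog_speak_impl and the two entry
variables are hypothetical names, and placing an entry pointer directly
in the Animal struct is a simplification of the MetaClass lookup.

```c
#include <stddef.h>

/* Method pointer plus the fixup to apply to the caller's view. */
typedef struct SpeakEntry {
    const char *(*method)(void *self);
    ptrdiff_t    fixup;
} SpeakEntry;

/* Hypothetical layout: Dog data is prepended to the Animal struct. */
typedef struct Animal { const SpeakEntry *speak; } Animal;
typedef struct Dog    { const char *sound; Animal superself; } Dog;

/* The override always expects a pointer to the start of the Dog. */
static const char*
Dog_speak_impl(void *self) {
    return ((Dog*)self)->sound;
}

/* Entry for callers holding a Dog pointer: no adjustment needed. */
static const SpeakEntry Dog_speak_via_Dog = { Dog_speak_impl, 0 };

/* Entry for callers holding the Animal view: negative fixup back to Dog. */
static const SpeakEntry Dog_speak_via_Animal
    = { Dog_speak_impl, -(ptrdiff_t)offsetof(Dog, superself) };

static const char*
Dog_speak(Dog *self) {
    const SpeakEntry *entry = &Dog_speak_via_Dog;
    return entry->method((char*)self + entry->fixup);
}

static const char*
Animal_speak(Animal *self) {
    const SpeakEntry *entry = self->speak;
    return entry->method((char*)self + entry->fixup);
}
```

The same implementation appears in two entries that differ only in the
fixup, and every additional superclass view of every overridden method
adds another entry of this kind.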

>> Another thing I'm wondering about is how casting of objects to superclasses
>> would work with the scheme you proposed. Casting to the parent class using
>> ->superself is easy. But I think we'll need an additional mechanism for
>> casts to superclasses further up in the hierarchy.
> Right, we need something similar to the C++ dynamic_cast operator and Java
> typecasts.  I think we can get away with two new macros per instantiable
> class, plus one new global function.
>      void*
>      Core_cast(void *object, MetaClass *source, MetaClass *target) {
>          if (object) {
>              MetaClass  *metaclass = ((Object*)object)->metaclass;
>              MetaClass **ancestors = metaclass->ancestors;
>              for (size_t i = 0, max = metaclass->num_ancestors; i < max; i++) {
>                  if (ancestors[i] == target) { // conversion is valid
>                      ptrdiff_t fixup = source->obj_alloc_size
>                                        - target->obj_alloc_size;
>                      return ((char*)object) + fixup;
>                  }
>              }
>          }
>          return NULL;
>      }
>      #define Object_METACLASS cOBJECT
>      #define OBJECT_CAST(self, type) \
>          ((type*)Core_cast(self, cOBJECT, type ## _METACLASS))
> Here's sample usage involving both downcasting (Object to Boxer) and upcasting
> (Boxer to Dog):
>      int
>      Boxer_equals(Boxer *self, Object *other) {
>          Boxer *twin = OBJECT_CAST(other, Boxer);
>          if (!twin)                                 { return false; }
>          if (strcmp(self->color, twin->color) != 0) { return false; }
>          Dog_equals_t super_equals
>              = SUPER_METHOD_PTR(cBOXER, Dog_equals);
>          return super_equals(BOXER_CAST(self, Dog), other);
>      }
> (Note that the usage of SUPER_METHOD_PTR is slightly different from how we use
> it in Lucy at present -- if we had followed current Lucy usage patterns, we
> would have cast the method pointer to Boxer_equals_t rather than Dog_equals_t.
> We use Dog_equals_t instead because it matches up more cleanly with the return
> type of BOXER_CAST(self, Dog).)
> Since each FOO_CAST macro requires a particular input type, this is an
> improvement in type safety compared to a C typecast, at a runtime CPU cost
> (same as C++ dynamic_cast).

Hmm, that would provide type safety for downcasts, but it doesn't look 
very efficient, considering that we have to downcast/upcast for every 
VArray or Hash access, for example.

I've been thinking more along the lines of inline functions or macros
that simply apply a fixup stored in a global variable to an object.
Something like this:

     static inline Animal*
     Boxer_As_Animal(Boxer *self) {
         char *view = (char*)self + Boxer_As_Animal_Fixup;
         return (Animal*)view;
     }

This would require even more global variables, though.
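
A self-contained sketch of that idea, to show where the extra globals
come from. The struct members and the bootstrap function
Boxer_init_fixups are assumptions made for illustration; a real build
would compute the fixups from class sizes known only at load time, not
from offsetof.

```c
#include <stddef.h>

/* Hypothetical layout: subclass data is prepended, parent struct comes last. */
typedef struct Animal { const char *name; } Animal;
typedef struct Dog    { int tricks;    Animal superself; } Dog;
typedef struct Boxer  { double weight; Dog    superself; } Boxer;

/* One global per (class, superclass) pair, filled in at startup. */
static ptrdiff_t Boxer_As_Dog_Fixup;
static ptrdiff_t Boxer_As_Animal_Fixup;

/* Hypothetical bootstrap step; offsetof stands in for runtime class sizes. */
static void
Boxer_init_fixups(void) {
    Boxer_As_Dog_Fixup    = (ptrdiff_t)offsetof(Boxer, superself);
    Boxer_As_Animal_Fixup = Boxer_As_Dog_Fixup
                            + (ptrdiff_t)offsetof(Dog, superself);
}

/* Upcast with a single add and no loop over ancestors. */
static inline Animal*
Boxer_As_Animal(Boxer *self) {
    char *view = (char*)self + Boxer_As_Animal_Fixup;
    return (Animal*)view;
}
```

The upcast itself is cheap, but a hierarchy of depth n needs on the
order of n globals per class, which is the cost mentioned above.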

All things considered, I'm not sure if prepending the data of subclasses 
in front of the object is worth all the trouble.

