lucy-dev mailing list archives

From Marvin Humphrey <mar...@rectangular.com>
Subject [lucy-dev] Non-deterministic destruction in Perl 5.15
Date Wed, 09 Nov 2011 01:32:46 GMT
Greets,

In Perl 5.15 (current "blead" Perl -- the developer release), Lucy fails most
of its tests because of an exception thrown during global destruction:

    (in cleanup) Insane attempt to destroy VTable for class 'Lucy::Object::Obj'
    lucy_VTable_destroy at /home/sts/cpansmoke/perl-5.15.2/cpan/build/Lucy-0.2.2-o_YHcb/core/Lucy/Object/VTable.c line 44
    at t/018-host.t line 0

That's a tripwire that I set because VTable's destructor should *never* be
invoked.  We leak VTables on purpose.  

What has changed in Perl 5.15 is that destructors are now called during global
destruction; previously, Perl freed all SVs during global destruction but did
not call DESTROY on objects.  

    http://search.cpan.org/~stevan/perl-5.15.3/pod/perlobj.pod#Global_Destruction

This change to Perl is going to require a corresponding change
to Lucy's Perl bindings.  Consider the following code:

    my %hash = (
        searcher => Lucy::Search::IndexSearcher->new(index => $path),
    );
    $hash{circular_reference} = \%hash;

Because of the circular reference, the hash's refcount never drops to zero, so
that Perl hash, the Searcher it refers to, and, crucially, the Searcher's inner
PolyReader will not be deallocated until global destruction.  During global
destruction, though, refcounting goes out the window and destruction order is
effectively random.

What we would ordinarily want to see is destruction moving from the outermost
object to the innermost:

    Perl hash
    IndexSearcher
    PolyReader
    SegReaders
    DataReaders
    InStreams
    FileHandles
    ...

This is important because when we get to the IndexSearcher's destructor, its
subcomponents still need to be valid:

    void
    IxSearcher_destroy(IndexSearcher *self) {
        DECREF(self->reader);
        // ...
        SUPER_DESTROY(self, INDEXSEARCHER);
    }

If self->reader has already been freed when this destructor gets called,
that's bad news -- we're going to be invoking DECREF on freed memory.  
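
To make the ordering concrete, here is a minimal, self-contained C sketch of
refcount-driven teardown.  ToyObj, toy_new, toy_decref, toy_demo and the
toy_log array are hypothetical names invented for illustration; they are not
Lucy's real Obj layout or its DECREF macro.  The point is only that the outer
destructor runs first and releases its inner object from inside that
destructor, which is the step that goes wrong if the inner object has already
been freed by someone else.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy object; NOT Lucy's real Obj layout. */
typedef struct ToyObj {
    const char    *name;      /* label used only for the destruction log */
    unsigned       refcount;
    struct ToyObj *inner;     /* owned subcomponent, may be NULL */
} ToyObj;

/* Records the order in which objects are destroyed. */
#define TOY_LOG_MAX 8
const char *toy_log[TOY_LOG_MAX];
size_t      toy_log_len = 0;

ToyObj*
toy_new(const char *name, ToyObj *inner) {
    ToyObj *self = malloc(sizeof(ToyObj));
    assert(self != NULL);
    self->name     = name;
    self->refcount = 1;
    self->inner    = inner;   /* takes over the caller's reference */
    return self;
}

/* Drop one reference; destroy the object when the count reaches zero. */
void
toy_decref(ToyObj *self) {
    if (self == NULL) { return; }
    assert(self->refcount > 0);
    if (--self->refcount > 0) { return; }
    if (toy_log_len < TOY_LOG_MAX) { toy_log[toy_log_len++] = self->name; }
    /* The inner object must still be valid at this point. */
    toy_decref(self->inner);
    free(self);
}

/* Build searcher -> reader, release the searcher, return the log length. */
size_t
toy_demo(void) {
    ToyObj *reader   = toy_new("PolyReader", NULL);
    ToyObj *searcher = toy_new("IndexSearcher", reader);
    toy_log_len = 0;
    toy_decref(searcher);
    return toy_log_len;
}
```

After toy_demo() returns, toy_log holds "IndexSearcher" followed by
"PolyReader": outermost first, innermost last.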

As far as I can tell, the only solution is to disconnect our DESTROY methods
when Perl enters global destruction and leak everything.  Here's sample XS
code to get the point across:

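    void
    DESTROY(self)
        lucy_IndexSearcher *self;
    PPCODE:
        /* Once Perl has entered global destruction, skip the destructor
         * and leak the object on purpose. */
        if (PL_phase != PERL_PHASE_DESTRUCT) {
            lucy_IxSearcher_destroy(self);
        }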
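
The same guard can be sketched in plain C, outside of XS, to show the intended
behavior in isolation.  Everything here is hypothetical: toy_phase stands in
for Perl's PL_phase, and ToySearcher with its int "reader" stands in for a
Lucy object that owns a subcomponent.  This is a sketch under those
assumptions, not Lucy's binding code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for Perl's PL_phase; not real interpreter state. */
typedef enum { TOY_PHASE_RUN, TOY_PHASE_DESTRUCT } ToyPhase;
ToyPhase toy_phase = TOY_PHASE_RUN;

/* Hypothetical wrapper object owning one heap-allocated subcomponent. */
typedef struct ToySearcher {
    int *reader;
} ToySearcher;

ToySearcher*
toy_searcher_new(void) {
    ToySearcher *self = malloc(sizeof(ToySearcher));
    assert(self != NULL);
    self->reader = malloc(sizeof(int));
    assert(self->reader != NULL);
    *self->reader = 42;
    return self;
}

/* Mirrors the guarded DESTROY: returns true if the object was torn down,
 * false if it was deliberately leaked because we are in the destruct phase. */
bool
toy_searcher_DESTROY(ToySearcher *self) {
    if (toy_phase == TOY_PHASE_DESTRUCT) {
        return false;   /* leak on purpose; subcomponents may be gone */
    }
    free(self->reader);
    free(self);
    return true;
}
```

During normal operation toy_searcher_DESTROY() frees the object and returns
true; once toy_phase is set to TOY_PHASE_DESTRUCT it touches nothing and
returns false, leaving the memory for the OS to reclaim at process exit.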

Of course, this defeats the purpose of the change that was made in Perl 5.15.
The rationale for the new behavior is to support situations where, for
example, you could guarantee that when a Perl interpreter in an embedded
system shuts down, *everything* gets reclaimed.  But I believe that
architecture is only feasible when you control all memory allocation (as when
the OS closes a process), and thus Perl's new global destruction model is
flawed: it cannot encompass external resources.

I wonder how many other systems like ours are out there that are vulnerable to
this flaw.  Not many CPAN distros are going to have test suites that validate
behavior under refcount leakage.

Marvin Humphrey

