incubator-lucy-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Marvin Humphrey <mar...@rectangular.com>
Subject [lucy-dev] FieldType Equals() and class membership
Date Thu, 03 Mar 2011 00:07:56 GMT
Greets,

Lucy is not supposed to let you change field definitions for an index.  If you
supply a Schema with a modified field type to an Indexer and attempt to modify
an existing index, Lucy will throw an exception.  You're allowed a new schema
if the "truncate" flag is supplied to Indexer, but in that case there will be
a hard cutover to the new Schema when Indexer_Commit() completes -- there is
never a situation where the same index hosts distinct segments with
conflicting field definitions.

The enforcement happens here, in core/Lucy/Index/Indexer.c:

            // Make sure than any existing fields which may have been
            // dynamically added during past indexing sessions get added.
            Schema *old_schema = PolyReader_Get_Schema(self->polyreader);
            Schema_Eat(schema, old_schema);

However, there is a weakness the definition checking performed by
Schema_Eat(): it relies upon FType_Equals() to detect conflicts, and
FType_Equals() doesn't enforce class membership.  That means that an
incompatible index definition can overwrite another one.

This bug was uncovered when we changed an index to use a new FieldType
subclass that spec'd an alternate posting format. (Alternate posting formats
are an undocumented feature of Lucy.) The new index was written successfully,
displacing the old when Indexer_Commit() finished.  An errant process using
the old schema then modified the index, adding a new segment in an
incompatible format and overwriting the new schema with the old.  Once that
update completed, the index was no longer usable, because the incorrect
posting decoder spec'd by the new FieldType was incorrect with the posting
data in all the old segments -- it read a bunch of junk, then finally yielded
a read-past-EOF error.

As a solution, I believe that FieldType's Equals() method should change to
enforce that both "self" and "other" belong to the same class.  We don't care
*which* class, we only care that they be the *same* class.  So long as we
don't require that that class be Lucy::Plan::FieldType itself, the method is
still inheritable and can be invoked via SUPER.

For some classes, e.g. CharBuf, Equals() should emphatically *not* test for
class membership, so that e.g. a CharBuf and a ViewCharBuf with the same
logical content test as equal.  For FieldType, I can't think of a scenario
where having objects which belong to different classes test as equal would be
desirable.

Patch below.

Marvin Humphrey


Index: ../core/Lucy/Plan/FieldType.c
===================================================================
--- ../core/Lucy/Plan/FieldType.c   (revision 1075667)
+++ ../core/Lucy/Plan/FieldType.c   (working copy)
@@ -87,6 +87,7 @@
 {
     FieldType *evil_twin = (FieldType*)other;
     if (evil_twin == self) return true;
+    if (FType_Get_VTable(self) != FType_Get_VTable(evil_twin)) return false;
     if (self->boost != evil_twin->boost) return false;
     if (!!self->indexed    != !!evil_twin->indexed)    return false;
     if (!!self->stored     != !!evil_twin->stored)     return false;


Mime
View raw message