From Nathan Kurz
Subject [lucy-dev] Dependency injection and customizing a scorer
Date Mon, 09 May 2011 05:34:37 GMT
I'd love to spend a little time discussing how one would actually go
about customizing a scorer.  It's not something I'm actually able to
spend time doing right now, but it seems like a good time to start
discussing how it should work and to get others more involved.

For the sake of argument, let's do something arbitrary but simple:
presume I want to modify the ORScorer to multiply the child scores
rather than add them.  Please ignore for now that this probably isn't
actually a good idea.

The straightforward but brutish option would be to make a one
character change in ORScorer_score() in ORMatcher.c[*]:
- score += scores[i];
+ score *= scores[i];
Recompile, and I'm done.  Simple and fast, but it has to be done in C
and distributed by patch and pray.
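
One wrinkle with the one-character patch, sketched below in standalone
C rather than against the real Lucy source: if the accumulator starts
at zero, as a summing loop normally would, then switching to `*=` makes
every score zero unless the initial value changes too.  The function
names here are hypothetical and only illustrate the two combining
rules.

```c
#include <assert.h>
#include <stddef.h>

/* Sum of child scores: the additive rule the patch starts from. */
float
combine_sum(const float *scores, size_t n) {
    float score = 0.0f;                 /* additive identity */
    for (size_t i = 0; i < n; i++) {
        score += scores[i];
    }
    return score;
}

/* Product of child scores: the rule the patch is aiming for. */
float
combine_product(const float *scores, size_t n) {
    assert(n > 0);                      /* this sketch expects at least one child */
    float score = 1.0f;                 /* multiplicative identity, not 0.0f */
    for (size_t i = 0; i < n; i++) {
        score *= scores[i];
    }
    return score;
}
```

With child scores 2.0 and 3.0, the first function combines them by
addition and the second by multiplication; starting the second at 0.0f
would return zero for any input.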

If instead I wanted to do it in a host language, it looks like
Clownfish provides this quite directly:  define a host function, and
then use VTable_override to replace the entry in the 'master' vtable
in the registry for ORMATCHER.  Everything else runs in C, and my
function gets called in Perl, Python, or Ruby.  Really cool!

Or maybe this isn't possible yet.  Is it?  And is there an established
stage where I would do fixups like this?  Is there a callback to do
it, or does one stick it in somewhere before the search?  Is the
registry global, or per Searcher?
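
To make the vtable question concrete, here is a minimal sketch of
replacing one slot in a shared vtable.  It is not the Clownfish API:
the types, OR_SCORER_VTABLE, and vtable_override_score() are all
hypothetical stand-ins, and the replacement is a plain C function where
the real thing would be a host-language callback.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Scorer Scorer;
typedef float (*score_fn)(const Scorer *self);

/* Hypothetical one-slot vtable; a real one would have many entries. */
typedef struct {
    score_fn score;
} ScorerVTable;

struct Scorer {
    const ScorerVTable *vtable;
    const float        *scores;
    size_t              matching_kids;
};

/* Default behaviour: add the child scores. */
float
sum_score(const Scorer *self) {
    float score = 0.0f;
    for (size_t i = 0; i < self->matching_kids; i++) {
        score += self->scores[i];
    }
    return score;
}

/* Replacement behaviour: multiply the child scores. */
float
product_score(const Scorer *self) {
    float score = 1.0f;
    for (size_t i = 0; i < self->matching_kids; i++) {
        score *= self->scores[i];
    }
    return score;
}

/* The single "master" vtable that every instance points at. */
ScorerVTable OR_SCORER_VTABLE = { sum_score };

/* Swap the score slot and hand back the previous entry. */
score_fn
vtable_override_score(ScorerVTable *vt, score_fn replacement) {
    assert(replacement != NULL);
    score_fn previous = vt->score;
    vt->score = replacement;
    return previous;
}

/* Dispatch through the vtable, as callers in C would. */
float
scorer_score(const Scorer *self) {
    return self->vtable->score(self);
}
```

Because the vtable in this sketch is a single global, the override
affects every scorer, including ones created before the swap.  That is
the trade-off behind the global versus per-Searcher question.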

But let's assume that for some reason I need to store a little extra
data, and thus want to create an entire MyOrScorer object and have it
used instead of ORScorer[**].  It seems like I would have to modify
ORCompiler_make_matcher() in ORQuery.c.[*]  But it seems error-prone
to have to rewrite that whole function in some other language.

Since we already have a registry in place, perhaps we could institute
a simple level of Dependency Injection that would make this easier?
Instead of hard-coding it, could we add a function entry in
ORQuery.cfh for or_scorer_new() and let it be easily overridden?
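
A rough sketch of what that injection point might look like, assuming
a registry that holds a constructor pointer.  Nothing here is existing
Lucy code: Registry, Matcher, compiler_make_matcher() and both
constructors are hypothetical, and the matcher is reduced to a tag plus
a child count so the example stays short.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { KIND_OR_SCORER, KIND_MY_OR_SCORER } MatcherKind;

/* Stand-in for a matcher object. */
typedef struct {
    MatcherKind kind;
    size_t      num_kids;
} Matcher;

typedef Matcher (*matcher_factory)(size_t num_kids);

/* The constructor that would otherwise be hard-coded. */
Matcher
default_or_scorer_new(size_t num_kids) {
    Matcher m = { KIND_OR_SCORER, num_kids };
    return m;
}

/* A user-supplied replacement constructor. */
Matcher
my_or_scorer_new(size_t num_kids) {
    Matcher m = { KIND_MY_OR_SCORER, num_kids };
    return m;
}

/* Hypothetical registry with one overridable entry. */
typedef struct {
    matcher_factory or_scorer_new;
} Registry;

Registry
registry_default(void) {
    Registry r = { default_or_scorer_new };
    return r;
}

/* Stand-in for make_matcher: the surrounding logic stays in one
 * place, and only the construction step goes through the registry. */
Matcher
compiler_make_matcher(const Registry *reg, size_t num_kids) {
    assert(reg != NULL && reg->or_scorer_new != NULL);
    return reg->or_scorer_new(num_kids);
}
```

Swapping the scorer then comes down to assigning
`reg.or_scorer_new = my_or_scorer_new;` before searching, with no need
to reimplement the function that calls it.  Passing the registry in
explicitly, as done here, is one way to make the override per-Searcher
rather than global.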

Also, presume I want to prototype that MyOrScorer class in the host
language rather than C.  How do I create the C-callable VTable for
this host class?  I think all the pieces are there, but I'm not seeing
quite how it happens.
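
For discussion, one generic shape such a C-callable vtable could take
is a trampoline: the vtable slot holds a small C function that forwards
to a callback plus an opaque pointer for the host object.  This is an
assumption about the general technique, not a description of how the
Clownfish bindings do it, and every name below is hypothetical.  The
demo callback is ordinary C standing in for a host-language method.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Scorer Scorer;

typedef struct {
    float (*score)(Scorer *self);
} ScorerVTable;

struct Scorer {
    const ScorerVTable *vtable;
    const float        *scores;
    size_t              matching_kids;
};

/* Signature a host binding would have to provide (hypothetical). */
typedef float (*host_score_cb)(void *host_obj, const float *scores,
                               size_t n);

/* Subclass carrying extra data.  The base struct comes first so a
 * MyOrScorer* can be passed wherever a Scorer* is expected. */
typedef struct {
    Scorer        base;
    void         *host_obj;     /* opaque handle to the host object */
    host_score_cb host_score;   /* host-side implementation of score */
} MyOrScorer;

/* Trampoline: a C function in the vtable that forwards to the host. */
float
my_or_scorer_trampoline(Scorer *self) {
    MyOrScorer *me = (MyOrScorer *)self;
    assert(me->host_score != NULL);
    return me->host_score(me->host_obj, self->scores,
                          self->matching_kids);
}

const ScorerVTable MY_OR_SCORER_VTABLE = { my_or_scorer_trampoline };

void
my_or_scorer_init(MyOrScorer *me, const float *scores, size_t n,
                  void *host_obj, host_score_cb cb) {
    me->base.vtable        = &MY_OR_SCORER_VTABLE;
    me->base.scores        = scores;
    me->base.matching_kids = n;
    me->host_obj           = host_obj;
    me->host_score         = cb;
}

/* Stand-in for a host method: the product of the child scores,
 * scaled by a boost value stored in the "host object". */
float
demo_host_score(void *host_obj, const float *scores, size_t n) {
    float score = *(const float *)host_obj;
    for (size_t i = 0; i < n; i++) {
        score *= scores[i];
    }
    return score;
}
```

C callers only ever see a Scorer* and call through its vtable; the
extra fields and the host callback stay hidden behind the trampoline.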

Finally, let's assume I like my prototype, and then write it in C and
compile it as a shared library.  How does this get loaded and linked
in?  What do I need to do to register its VTable?

Thanks!  I apologize for asking lots of questions in a row.  And I
don't mean to imply that these haven't been considered.  At a glance,
it looks like Lucy is very close to being able to answer all of these
in a pretty rock solid way, which implies some good design choices.


[*] It does seem like all the functions I pick are in files of
different names.   I love the simplicity of Prefix == File.
[**] OK, I'm getting off track here, but why is OR capitalized?  I
keep typing it wrong AND needing to fix it.

