lucy-dev mailing list archives

From: Marvin Humphrey <mar...@rectangular.com>
Subject: [lucy-dev] Miscellaneous API changes (Searcher, HeatMap, etc)
Date: Tue, 26 Oct 2010 00:19:15 GMT

Greets,

I expect to commit a few changes to the public API in a bit, and wanted to
document the rationales and open an opportunity for discussion.

First, there are four methods and one constructor which should have been
officially exposed a while ago; we'll correct the oversight and make them
public.

    TermQuery_Get_Term()
    TermQuery_Get_Field()
    Doc_Set_Doc_ID()
    Doc_Get_Doc_ID()
    Arch_new()

Second, Searcher_Fetch_Doc() will be changed to return a HitDoc rather than an
Obj.  I'm also going to remove the "score" and "offset" parameters, leaving
only "doc_id".  Since this brings the argument count down to one, the Perl
bindings will now use a single positional arg instead of labeled params.

    # Before:
    my $doc = $searcher->fetch_doc( doc_id => $doc_id );

    # After:
    my $doc = $searcher->fetch_doc($doc_id);

Similarly, DocReader_Fetch(), which takes three params and returns an Obj, will
be replaced by DocReader_Fetch_Doc(), which takes only a doc_id and returns a
HitDoc.

These Fetch_Doc() changes close the book on an unsuccessful experiment:
allowing the Fetch_Doc() stack to return arbitrary objects.  It was this
ill-advised flexibility which necessitated the addition of parameters to
Fetch_Doc(); we'll be returning to the Lucene model for fetching documents,
which is simpler for doing simple things, yet still makes hard things possible:

    # Before ($doc can be an arbitrary object):
    my $doc = $searcher->fetch_doc(
        doc_id => $doc_id,
        offset => $offset,
        score  => $score,
    );

    # After ($doc isa HitDoc):
    my $doc = $searcher->fetch_doc($doc_id - $offset);
    $doc->set_doc_id($doc_id);
    $doc->set_score($score);
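
To illustrate why the "After" form is enough for the hard case, here is a
minimal sketch of a caller that spans several sub-searchers.  It is not Lucy
code: MockHitDoc, MockSearcher, fetch_global() and the @offsets table are
hypothetical stand-ins invented for this example.  Only the three calls at the
end of fetch_global() mirror the pattern above.

```perl
use strict;
use warnings;

# Hypothetical stand-in for HitDoc: holds a doc id, a score and some fields.
package MockHitDoc;
sub new {
    my ( $class, %args ) = @_;
    return bless {
        doc_id => $args{doc_id},
        score  => 0,
        fields => $args{fields},
    }, $class;
}
sub get_doc_id { $_[0]{doc_id} }
sub set_doc_id { $_[0]{doc_id} = $_[1] }
sub get_score  { $_[0]{score} }
sub set_score  { $_[0]{score} = $_[1] }

# Hypothetical stand-in for a single-index searcher with local doc ids 1..N.
package MockSearcher;
sub new {
    my ( $class, @titles ) = @_;
    return bless { titles => [@titles] }, $class;
}
sub fetch_doc {
    my ( $self, $doc_id ) = @_;    # single positional arg, as in "After"
    return MockHitDoc->new(
        doc_id => $doc_id,
        fields => { title => $self->{titles}[ $doc_id - 1 ] },
    );
}

package main;

# Two sub-searchers; the second one's doc ids start after the first one's.
my @searchers = (
    MockSearcher->new(qw( alpha beta gamma )),
    MockSearcher->new(qw( delta epsilon )),
);
my @offsets = ( 0, 3 );

# Map a global doc id to a sub-searcher, fetch with the local id, then
# restore the global id and attach the score on the returned doc.
sub fetch_global {
    my ( $doc_id, $score ) = @_;
    my $tick = 0;
    for my $i ( 0 .. $#offsets ) {
        $tick = $i if $doc_id > $offsets[$i];
    }
    my $offset = $offsets[$tick];
    my $doc    = $searchers[$tick]->fetch_doc( $doc_id - $offset );
    $doc->set_doc_id($doc_id);
    $doc->set_score($score);
    return $doc;
}

my $doc = fetch_global( 5, 0.75 );
printf "%s %d %.2f\n", $doc->{fields}{title}, $doc->get_doc_id, $doc->get_score;
# prints "epsilon 5 0.75"
```

The sub-searcher only ever sees its own local doc id; the offset and score
are handled entirely by the caller, which is why fetch_doc() no longer needs
to accept them.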

Lastly, I'm going to withdraw HeatMap (used by Highlighter) from the public
API.  It isn't complete or polished enough yet to do what it sets out to do,
and we need to think its interface through a little more.

Marvin Humphrey


