incubator-lucy-user mailing list archives

From Marvin Humphrey <mar...@rectangular.com>
Subject Re: [lucy-user] Dynamic document boost
Date Sat, 11 Feb 2012 21:18:09 GMT
On Sat, Feb 11, 2012 at 10:03:37PM +0100, Nick Wellnhofer wrote:
> What's the best way to apply a boost factor dynamically to a (small)  
> subset of documents?

I would suggest using a RequiredOptionalQuery.  The required_query decides
which documents match; the optional_query only adds to the score of the
matching documents that it also matches, which is where the boost goes.

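    # The user's query decides which documents match at all.
    my $parsed_query = $query_parser->parse($user_query_string);

    # Matches only the subset of documents that should be boosted.
    my $user_id_boost_query = Lucy::Search::TermQuery->new(
        field => 'user_id',
        term  => $user_id,
    );
    $user_id_boost_query->set_boost($arbitrary_boost);

    # Required clause selects the hits; optional clause only adds score.
    my $req_opt_query = Lucy::Search::RequiredOptionalQuery->new(
        required_query => $parsed_query,
        optional_query => $user_id_boost_query,
    );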

If the query to identify the subset of documents is very expensive, you might
look into using LucyX::Search::Filter to cache the results (but note that
Filter does not cache in a clustered environment).

> Is there a better way than to simply retrieve all the results, apply the  
> boost factor manually to the scores and sort the results again?

I hope you don't have to resort to post-search re-scoring.  That's slow to
begin with, and it scales poorly because every matching document has to be
retrieved.  You would also need non-idiomatic sorting code (a priority queue
rather than the Perl sort() function) if you don't want memory usage to
balloon.
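
For illustration only, and not as something to prefer over the
RequiredOptionalQuery approach: a minimal sketch of the bounded top-N
selection that such post-search code would need.  It uses a small sorted
buffer as a stand-in for a real priority queue.  The hit data, the
%boost_for table and the offer_hit() helper are all hypothetical and are not
part of the Lucy API.

```perl
use strict;
use warnings;

# Keep only the $n highest-scoring entries seen so far.
# @$top stays sorted by descending score and never grows past $n,
# so memory is bounded no matter how many hits are streamed through.
sub offer_hit {
    my ($top, $n, $doc_id, $score) = @_;

    # Buffer is full and this hit cannot beat the current worst entry.
    return if @$top >= $n && $score <= $top->[-1]{score};

    # Binary search for the insertion point in the descending list.
    my ($lo, $hi) = (0, scalar @$top);
    while ($lo < $hi) {
        my $mid = int(($lo + $hi) / 2);
        if ($top->[$mid]{score} >= $score) { $lo = $mid + 1 }
        else                               { $hi = $mid }
    }
    splice @$top, $lo, 0, { doc_id => $doc_id, score => $score };
    pop @$top if @$top > $n;    # drop the entry that fell off the end
    return;
}

# Hypothetical raw results: [doc_id, raw_score, user_id] triples.
my @raw_hits = (
    [ 1, 0.50, 'alice' ],
    [ 2, 0.90, 'bob'   ],
    [ 3, 0.40, 'alice' ],
    [ 4, 0.70, 'carol' ],
);
my %boost_for = ( alice => 2.0 );    # assumed boost factors per user_id

my @top;
for my $hit (@raw_hits) {
    my ($doc_id, $score, $user_id) = @$hit;
    $score *= $boost_for{$user_id} // 1.0;    # apply the boost by hand
    offer_hit(\@top, 2, $doc_id, $score);
}

print join(', ', map { "$_->{doc_id}:$_->{score}" } @top), "\n";
```

Even with the buffer capped at two entries, every raw hit still has to be
fetched and re-scored, which is the cost described above.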

Marvin Humphrey

