lucy-user mailing list archives

From Marvin Humphrey <mar...@rectangular.com>
Subject Re: [lucy-user] Filter + ClusterSearcher/QueryParser/make_compiler
Date Tue, 20 Dec 2011 01:25:47 GMT
On Mon, Dec 19, 2011 at 03:42:02PM +0200, lance bowler wrote:
> I'm trying to wrap my head around how to perform a normal query, plus
> filter out/in results based on a field which has 0|1 (non-indexed,
> stored), something like Google's safesearch, using
> ClusterSearcher/QueryParser/make_compiler.

I would just avoid LucyX::Search::Filter in a Remote context:

    http://incubator.apache.org/lucy/docs/perl/LucyX/Search/Filter.html#BUGS

    Filters do not cache when used in a search cluster with LucyX::Remote's
    SearchServer and SearchClient.

(It doesn't cache because Filter uses a BitVector internally to cache its
result set, and a result set with one bit per indexed document is typically
too big to serialize and send over a network connection.)

> But I'm trying to figure out how to do that when using the following sequence:
> 
>   Lucy::Search::ANDQuery ('field:0|1' presumably) or LucyX::Search::Filter,
>   my $searcher = LucyX::Remote::ClusterSearcher->new...
>   my $query_parser = Lucy::Search::QueryParser->new...
>   my $query_obj = $query_parser->parse($user_query_str);
>   my $query_compiler = $query_obj->make_compiler( searcher => $searcher );
>   $hits = $searcher->hits( query => $query_compiler,...);

Probably something like this:

    my $cluster_searcher = LucyX::Remote::ClusterSearcher->new(
        schema => $schema,
        shards => \@shards,
    );
    my $query_parser = Lucy::Search::QueryParser->new(
        schema => $schema,
        fields => [qw( title content )],
    );

    ...

    # Parse the user's query, then AND it with a TermQuery acting as the filter.
    my $user_query = $query_parser->parse($user_query_string);
    my $filter = Lucy::Search::TermQuery->new(field => 'foo', term => '1');
    my $and_query = Lucy::Search::ANDQuery->new(
        children => [$user_query, $filter],
    );
    # Compile the combined query (not the bare user query) against the cluster.
    my $weighted_query = $and_query->make_compiler(searcher => $cluster_searcher);
    my $hits = $cluster_searcher->hits(query => $weighted_query);

For more on working with Queries, see the QueryObjects tutorial chapter:

  http://incubator.apache.org/lucy/docs/perl/Lucy/Docs/Tutorial/QueryObjects.html

> Finally:  *can* I filter on a field which is not indexed (contains a
> simple 0 or 1)?

Nope.  Information which is stored but not indexed is not searchable.  Lucy is
not a traditional database, and it does not provide out-of-the-box support for
"full table scans" of stored data -- it only knows how to search indexed data.
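
To filter on the flag, the field would need to be declared as indexed in the
schema.  A minimal sketch, assuming the flag lives in a field named 'foo' (as
in the TermQuery example above) and that existing documents are re-indexed
after the schema change:

    use Lucy::Plan::Schema;
    use Lucy::Plan::StringType;

    my $schema = Lucy::Plan::Schema->new;

    # StringType indexes the value as a single un-analyzed term, so a
    # TermQuery for '0' or '1' can match it exactly.
    my $flag_type = Lucy::Plan::StringType->new(
        indexed => 1,
        stored  => 1,
    );
    # 'foo' is a placeholder field name, not something Lucy requires.
    $schema->spec_field( name => 'foo', type => $flag_type );

The other fields (title, content, and so on) would be declared on the same
schema as before; only the flag field's type changes.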

Cheers,

Marvin Humphrey

