incubator-lucy-dev mailing list archives

From: "David E. Wheeler" <da...@kineticode.com>
Subject: Re: [lucy-dev] A Schema for PGXN
Date: Fri, 18 Mar 2011 22:24:55 GMT

On Mar 18, 2011, at 2:36 PM, Marvin Humphrey wrote:

> Here's how I would express your schema in code:
> 
>    my $schema = Lucy::Plan::Schema->new;
>    my $polyanalyzer  = Lucy::Analysis::PolyAnalyzer->new(language => 'en');
>    my $fulltext_type = Lucy::Plan::FullTextType->new(
>        analyzer      => $polyanalyzer,
>        highlightable => 1,             # maybe
>    );
>    $schema->spec_field(name => 'Title',    type => $fulltext_type);
>    $schema->spec_field(name => 'Abstract', type => $fulltext_type);
>    $schema->spec_field(name => 'Content',  type => $fulltext_type);
>    my $pipe_toker = Lucy::Analysis::RegexTokenizer->new(pattern => '[^|]+'); 
>    my $pipe_type  = Lucy::Plan::FullTextType->new(analyzer => $pipe_toker);
>    $schema->spec_field(name => 'Tags',     type => $pipe_type);
>    $schema->spec_field(name => 'Metadata', type => $pipe_type);
> 
> I think that's the most straightforward way to start out.  From there, you can
> tweak and try other options as necessary.

Thanks. I'm using KinoSearch (KS), though. It's the same interface, right?

>> So for those fields that don't apply to a thing, like "tags" for a tag
>> object, I'd just provide no value. Otherwise, I'd like to do a full-text
>> search on all these fields.
> 
> The default behavior of Lucy's QueryParser is to search all indexed fields.
> The weighting's going to get a little weird with the Tags and Metadata fields
> because of length normalization, but that's something to wrestle with later.

I don't understand what that means, sorry.

David
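
Note on "length normalization" in the quoted text: search engines in this family usually scale a field's score contribution down as the field gets longer, so a match in a two-token Tags field can count for far more than the same match in a long Content field. The following is a minimal sketch in plain Perl, assuming the common 1/sqrt(token count) form. The exact factor that Lucy or KinoSearch applies may differ, and the length_norm helper and the field values here are hypothetical, made up only for illustration.

```perl
use strict;
use warnings;

# Assumed formula: norm = 1 / sqrt(number of tokens in the field).
# This is an illustration, not a call into the Lucy or KinoSearch API.
sub length_norm {
    my ($num_tokens) = @_;
    return 0 if $num_tokens < 1;    # guard against an empty field
    return 1 / sqrt($num_tokens);
}

# Hypothetical field values, not taken from the PGXN data.
my %fields = (
    Tags    => 'json|parser',
    Content => join(' ', ('word') x 400),
);

for my $name (sort keys %fields) {
    # Tags are split on pipes (as in the '[^|]+' tokenizer quoted above);
    # Content is split on whitespace as a stand-in for a real analyzer.
    my $pattern = $name eq 'Tags' ? qr/[^|]+/ : qr/\S+/;
    my @tokens  = $fields{$name} =~ /$pattern/g;
    printf "%-8s tokens=%3d  norm=%.3f\n",
        $name, scalar @tokens, length_norm(scalar @tokens);
}
```

Under that assumption the two-token Tags field gets a factor of about 0.71 while the 400-token Content field gets 0.05, which is the kind of imbalance the "weird weighting" remark appears to refer to.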