lucene-general mailing list archives

From Erik Hatcher <e...@ehatchersolutions.com>
Subject Re: multiple instances of fields or attributes
Date Mon, 04 Feb 2008 13:26:20 GMT
Yes, Lucene supports multiple instances of same-named fields.   There  
is one trick you'll need to leverage for the proximity operators to  
work as you expect - positionIncrementGap.

For example, if you index "Doe, John" and "Smith, Fred" as separate
name field instances on the same document, a phrase query for
name:"john smith" would match that document, because by default the
gap is 0 and the tokens of successive instances get consecutive
positions.  Setting the position increment gap to something greater
than your desired phrase slop would prevent this.
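
To make the effect concrete, here is a small stand-alone Java sketch that
models token positions only. It is not Lucene code: the class
PositionGapDemo, its Token type and the methods index and phraseMatches are
hypothetical names made up for illustration, and the slop check is a
simplified approximation of sloppy phrase matching. In Java Lucene itself
the gap comes from the analyzer's getPositionIncrementGap(String fieldName)
method, which returns 0 unless overridden.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-alone model of token positions; this is NOT the Lucene API.
class PositionGapDemo {

    // A term plus the position it would be assigned at index time.
    static final class Token {
        final String term;
        final int position;
        Token(String term, int position) { this.term = term; this.position = position; }
    }

    // Tokenize each value of a multi-valued field and assign positions,
    // adding `gap` extra positions between successive values.
    static List<Token> index(List<String> values, int gap) {
        List<Token> tokens = new ArrayList<>();
        int next = 0;
        for (int i = 0; i < values.size(); i++) {
            if (i > 0) next += gap;   // the position increment gap
            for (String term : values.get(i).toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
                if (!term.isEmpty()) tokens.add(new Token(term, next++));
            }
        }
        return tokens;
    }

    // Simplified sloppy phrase test (an assumption, not Lucene's scorer):
    // shift each matched position back by the term's offset in the phrase
    // and accept if the shifted positions span at most `slop`.
    static boolean phraseMatches(List<Token> tokens, List<String> phrase, int slop) {
        return search(tokens, phrase, 0, Integer.MAX_VALUE, Integer.MIN_VALUE, slop);
    }

    private static boolean search(List<Token> tokens, List<String> phrase,
                                  int i, int min, int max, int slop) {
        if (i == phrase.size()) return true;
        for (Token t : tokens) {
            if (!t.term.equals(phrase.get(i))) continue;
            int shifted = t.position - i;
            int lo = Math.min(min, shifted);
            int hi = Math.max(max, shifted);
            if (hi - lo <= slop && search(tokens, phrase, i + 1, lo, hi, slop)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Doe, John", "Smith, Fred");
        List<String> phrase = Arrays.asList("john", "smith");
        // Gap 0: "john" and "smith" are adjacent across the two instances.
        System.out.println(phraseMatches(index(names, 0), phrase, 0));
        // Gap 100: the instances are 100 positions apart, beyond any small slop.
        System.out.println(phraseMatches(index(names, 100), phrase, 0));
    }
}
```

With a gap of 0 the model places "john" at position 1 and "smith" at
position 2, so the phrase matches across the two names; with a gap of 100
"smith" moves to position 102 and the match goes away, while phrases inside
a single name still match.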

	Erik


On Feb 3, 2008, at 4:14 PM, André Warnier wrote:

> Hi.
>
> I am totally new to Lucene, and currently investigating the usage of
> Lucene for a new development project. In fact, for evaluation I am
> using the C port of Lucene, through the Perl "Lucene" module.  I
> believe my question is generic, but please tell me if it is otherwise.
> (Please adapt this question to the Java environment if needed; what I
> want to know is the fundamentals of Lucene.)
>
> In Perl, to add items to the Lucene index, I do something like
> my $doc = new Lucene::document;
> $doc->addfield('title','value1');
> $doc->addfield('author','value2');
> $doc->addfield('subject','value3');
> $lucene_writer->addDocument($doc);
> and that works fine.
>
> Now my question is: can I have separate "instances" of the field
> 'author' in the same document, like
> 'author' = 'Einstein, Albert'
> 'author' = 'Newton, Isaac'
> 'author' = 'Freud, Sigmund'
>
> Could I just do several times
> $doc->addfield('author','name');
> and would Lucene index separate "instances" of this field for the same
> document ?
>
> The reason being that I would like to search something like "Einstein
> Albert"~1  (adjacent), but without finding another document which
> would have a concatenated field like "Thomas, Albert; Einstein, Joseph".
> (The same case occurs for instance for a field "keywords".)
>
> Does this question make sense with Lucene?
> If the above is not possible, then is this type of case usually
> handled otherwise in Lucene, and how?
>
> Thank you in advance,
> aw
>

