lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Michael Wiegand <michael.wieg...@lsv.uni-saarland.de>
Subject Re: index enforcing query terms to appear within the same sentence
Date Fri, 04 Mar 2011 14:40:37 GMT
Thank you for all these useful hints!

If I use the multi-valued fields in combination with "modified" position 
increments, I would actually distort the shape of a document.
For instance, if I would like to compare a retrieval enforcing query 
term co-occurrence within the same sentence with a co-occurrence using 
PhraseQuery (or SpanNearQuery) just enforcing that the query terms have 
to appear within a text window of n words (also allowing this window to 
cross sentence boundaries), I would need to create another index where I 
do not modify the position increments.
Is that right?

Best,
Michael

Ian Lea schrieb:
> You can use multi valued fields if you play with the position
> increment gap.  See e.g.
> http://lucene.472066.n3.nabble.com/Problem-searching-in-the-same-sentence-td1501269.html
>
> A google search for "lucene indexing sentences" or similar finds that, and more.
>
>
> Different docs can have different fields/different numbers of fields,
> but the position gap approach is probably better.
>
>
> --
> Ian.
>
>
> On Fri, Mar 4, 2011 at 7:06 AM, Michael Wiegand
> <michael.wiegand@lsv.uni-saarland.de> wrote:
>   
>> Hi,
>>
>> I would like to create an index with Lucene to a document collections of
>> text files.
>> The index should be created in such a way, that for the search I can enforce
>> that query term A and query term B are contained within the same sentence.
>>
>> How should implement the index? Should I have for every sentence a different
>> field (but make sure that it is not a multi-valued field because they would
>> get merged which is exactly what I do not want)?
>> Would it be problematic that different documents would then end up having
>> different numbes of fields?
>>
>> Thank you in advance!
>>
>> Best,
>> Michael
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>>     
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
>   


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message