lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Steve Rowe <sar...@gwmail.syr.edu>
Subject Re: positional token info
Date Tue, 21 Oct 2003 14:46:31 GMT
Erik,

I've submitted a patch (BUG# 23730) very similar to yours, in response 
to a request to fix phrases matching where they should not:

<URL:http://mail-archive.com/lucene-user@jakarta.apache.org/msg04349.html>

Bug #23730:
<URL:http://nagoya.apache.org/bugzilla/show_bug.cgi?id=23730>

 > But, how would you actually *use* an index that was indexed with the
 > holes noted by > 1 position increments?

As the lucene-user email linked above notes, setting the position 
increment interdicts false phrase matching.

Steve Rowe

Erik Hatcher wrote:
> On Tuesday, October 21, 2003, at 03:36  AM, Pierrick Brihaye wrote:
> 
>> The basic idea is to have several tokens at the same position (i.e. 
>> setPositionIncrement(0)) which are different possible stems for the 
>> same word.
> 
> 
> Right.  Like I said, I recognize the benefits of using a position 
> increment of 0.
> 
>>> I certainly see the benefit of putting tokens into zero-increment 
>>> positions, but are increments of 2 or more at all useful?
>>
>>
>> Who knows ? I may be interesting  to keep track of the *presence* of 
>> "empty words", e.g. "[the] sky [is] blue", "[the] sky [is] [really] 
>> blue", "[the] sky [is] [that] [really] blue". The traditionnal 
>> reduction to "sky blue" is maybe over-simplistic for some cases...
> 
> 
> But, how would you actually *use* an index that was indexed with the 
> holes noted by > 1 position increments?
> 
>     Erik


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message