lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Michael McCandless <luc...@mikemccandless.com>
Subject Re: Indexing synonyms for multiple words
Date Mon, 02 Mar 2009 16:41:22 GMT

Since Lucene doesn't represent/store end position for a token, I don't  
think the index can properly represent SYN spanning two positions?

I suppose you could encode this into payloads, and create a custom  
query that would look at the payload to enforce the constraint.

Or, if you switch to doing SYN expansion only at runtime (not adding  
it to the index), that might work.

Mike

Uwe Schindler wrote:

> I think his problem is, that "SYN" is a synonym for the phrase "WORD1
> WORD2". Using these positions, a phrase like "SYN WORD2" would also  
> match
> (or other problems in queries that depend on order of words).
>
> Uwe
>
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: uwe@thetaphi.de
>
>> -----Original Message-----
>> From: Michael McCandless [mailto:lucene@mikemccandless.com]
>> Sent: Monday, March 02, 2009 4:07 PM
>> To: java-user@lucene.apache.org
>> Subject: Re: Indexing synonyms for multiple words
>>
>>
>> Shouldn't WORD2's position be 1 more than your SYN?
>>
>> Ie, don't you want these positions?:
>>
>>    WORD1  2
>>    WORD2  3
>>    SYN 2
>>
>> The position is the starting position of the token; Lucene doesn't
>> store an ending position
>>
>> Mike
>>
>> Sumukh wrote:
>>
>>> Hi,
>>>
>>> I'm fairly new to Lucene. I'd like to know how we can index synonyms
>>> for
>>> multiple words.
>>>
>>> This is the scenario:
>>>
>>> Consider a sentence: AAA BBB WORD1 WORD2 EEE FFF GGG.
>>>
>>> Now assume the two words combined WORD1 WORD2 can be replaced by
>>> another
>>> word SYN.
>>>
>>> If I place SYN after WORD1 with positionIncrement set to 0, WORD2  
>>> will
>>> follow SYN,
>>> which is incorrect; and the other way round if I place it after  
>>> WORD2.
>>>
>>> If any of you have solved a similar problem, I'd be thankful if you
>>> could
>>> share some light on
>>> the solution.
>>>
>>> Regards,
>>> Sumukh
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message