lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Doug Cutting <cutt...@lucene.com>
Subject Re: inter-term correlation [was Re: Vector Space Model in Lucene?]
Date Fri, 14 Nov 2003 20:32:53 GMT
I'm still confused what the issue here is.  If you're interested in 
stopping exact phrase matching from crossing sentence boundaries, that's 
easy to do with setPositionIncrement().  If you want to score an exact 
phrase match higher than a sloppy phrase match, and in turn score this 
higher than a non-phrasal match, this is all easy to do with Lucene. 
Depending on how much control you need, you could do it with a single 
sloppy phrase query (perhaps with a custom Similarity implementation), 
or perhaps you'd need to combine an OR query of the terms with an exact 
phrase match and with a sloppy phrase match, each with different boosts.

Certainly there are lots of scoring algorithms that one cannot easily 
implement with Lucene.  I'm just not yet clear on what you need to do 
that Lucene cannot support.

Cheers,

Doug

Chong, Herb wrote:
> then i wouldn't have typed capital gains tax. there is psychology of query creation too
and that is one thing i am taking advantage of.
> 
> Herb....
> 
> -----Original Message-----
> From: Doug Cutting [mailto:cutting@lucene.com]
> Sent: Friday, November 14, 2003 3:15 PM
> To: Lucene Users List
> Subject: Re: inter-term correlation [was Re: Vector Space Model in Lucene?]
> 
> 
> Have sentence boundaries actually proven to be that userful in this sort 
> of thing.  For example, if the text were something like:
> 
>    "Such sales would be considered long term capital gains.  Tax on 
> these is 20%."
> 
> Then penalizing for the sentence boundary wouldn't be valid.
> 
> Doug
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message