lucene-java-user mailing list archives

From Ken Krugler <kkrugler_li...@transpac.com>
Subject Re: alternative scoring algorithm for PhraseQuery
Date Thu, 18 Oct 2007 01:44:35 GMT
Hi Philipp,

At 10:49 pm +0100 3/7/07, Paul Elschot wrote:
>On Wednesday 07 March 2007 18:12, Philipp Nanz wrote:
>>  Thanks for your answers. Your input is really appreciated :-)
>>
>>  @Paul Elschot:
>>  Thanks for the hint. I guess I could use coord() to penalize missing
>>  terms like this:
>>
>>  Query: a b c d
>>  Doc A: a b c d => sloppyFreq(0) * coord(4, 4) = 1
>>  Doc B: a b c => sloppyFreq(0) * coord(3, 4) = 0,75
>>
>>  Doc A would score higher. I guess that might be a valid solution.
>>
>>  There is a drawback though, i.e. sloppyFreq(1) * coord(4, 4) = 0,5
>>
>>  So a perfect match with one insertion would score less than a 3 of 4
>>  match with no slop.
>
>Your examples are based on DefaultSimilarity.
>With a  Similarity in your Scorer you can leave the tradeoff between these
>factors to the user of your query by letting them provide the Similarity
>at query time.
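
The arithmetic quoted above can be reproduced with a small standalone sketch. It does not use Lucene itself: the `Factors` interface, the `GENTLE_SLOP` variant, and its 0.25 weight are hypothetical names and values chosen for illustration. Only the two `DEFAULT` formulas follow DefaultSimilarity, where sloppyFreq(distance) is 1 / (distance + 1) and coord(overlap, maxOverlap) is overlap / maxOverlap. The point is the one Paul makes: swapping the Similarity changes how slop trades off against missing terms.

```java
public class PhraseScoreSketch {

    /** Hypothetical stand-in for the two Similarity factors discussed in the thread. */
    interface Factors {
        float sloppyFreq(int distance);
        float coord(int overlap, int maxOverlap);
    }

    /** Mirrors the DefaultSimilarity formulas that the quoted numbers rely on. */
    static final Factors DEFAULT = new Factors() {
        public float sloppyFreq(int distance) { return 1.0f / (distance + 1); }
        public float coord(int overlap, int maxOverlap) { return overlap / (float) maxOverlap; }
    };

    /** Assumed alternative (not from the thread): a milder slop penalty. */
    static final Factors GENTLE_SLOP = new Factors() {
        public float sloppyFreq(int distance) { return 1.0f / (1.0f + 0.25f * distance); }
        public float coord(int overlap, int maxOverlap) { return overlap / (float) maxOverlap; }
    };

    /** Combined factor for a phrase match: slop penalty times term coverage. */
    static float score(Factors f, int distance, int matched, int total) {
        return f.sloppyFreq(distance) * f.coord(matched, total);
    }

    public static void main(String[] args) {
        // Doc A: all 4 terms, no slop; Doc B: 3 of 4 terms, no slop.
        System.out.println("default, exact 4/4:  " + score(DEFAULT, 0, 4, 4));
        System.out.println("default, exact 3/4:  " + score(DEFAULT, 0, 3, 4));
        // The drawback from the thread: one insertion drops below the 3-of-4 match.
        System.out.println("default, slop 1 4/4: " + score(DEFAULT, 1, 4, 4));
        // With the assumed gentler penalty, the full match with one insertion ranks higher.
        System.out.println("gentle,  slop 1 4/4: " + score(GENTLE_SLOP, 1, 4, 4));
    }
}
```

In a real Scorer the same effect would come from passing a custom Similarity at query time, so the caller decides which of the two penalties dominates.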

[snip]

I'm curious if Paul's input here helped you finish your 
FuzzyPhraseQuery (or FuzzySpanQuery) addition to Lucene.

Thanks,

-- Ken
-- 
Ken Krugler
Krugle, Inc.
+1 530-210-6378
"If you can't find it, you can't fix it"
