lucene-java-user mailing list archives

From "Karl Koch" <TheRan...@gmx.net>
Subject Re: Lucene scoring: coord_q_d factor
Date Thu, 14 Dec 2006 01:35:55 GMT
Hello Paul,

Thank you for providing the link to that paper. I read it again, and you are right. I found
the following passage:

"In normal term co-ordination matches, if a request and document have a frequent term in common,
this counts for as much as a non-frequent one; so if a request and document share three common
terms, the document is retrieved at the same level as another one sharing three rare terms
with the request. But it seems we should treat matches on non-frequent terms as more valuable
than ones on frequent terms, without disregarding the latter altogether. The natural solution
is to correlate a term's matching value with its collection frequency."

If I understand that extract correctly, it suggests combining coordination level matching
with IDF. I would be interested in your view, and in the views of others reading this.
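
As a minimal sketch of what that combination could look like (not Lucene's actual scoring, and
not necessarily the exact weighting used in the paper), the Python below compares plain
coordination level matching with an IDF-weighted variant on a toy collection. The helper names,
the toy documents, and the idf form log(N/n) are all assumptions made for illustration.

```python
import math


def coordination_level(query, doc):
    """Count the distinct query terms that also occur in the document."""
    return len(set(query) & set(doc))


def idf(term, docs):
    """Assumed idf form: log(N / n), with N documents in total and n containing the term."""
    n = sum(1 for d in docs if term in d)
    return math.log(len(docs) / n) if n else 0.0


def idf_weighted_match(query, doc, docs):
    """Sum the idf of every shared term instead of counting each match as 1."""
    return sum(idf(t, docs) for t in set(query) & set(doc))


# Hypothetical toy collection: "the", "of", "and" are frequent, the other terms are rare.
doc_common = {"the", "of", "and"}
doc_rare = {"lucene", "scoring", "coord"}
docs = [
    doc_common,
    doc_rare,
    {"the", "of", "and", "search"},
    {"the", "of", "and", "index"},
]
query = ["the", "of", "and", "lucene", "scoring", "coord"]

for name, doc in (("doc_common", doc_common), ("doc_rare", doc_rare)):
    print(name, coordination_level(query, doc), round(idf_weighted_match(query, doc, docs), 3))
```

Both documents match at coordination level 3, but the IDF-weighted score ranks the document
sharing the rare terms higher while still giving the other one a non-zero score, which is the
behaviour the quoted passage asks for.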

Are there any other papers that regard the combination of coordination level matching and
TFxIDF as advantageous?  

Cheers,
Karl

-------- Original Message --------
Date: Wed, 13 Dec 2006 21:00:45 +0100
From: Paul Elschot <paul.elschot@xs4all.nl>
To: java-user@lucene.apache.org
Subject: Re: Lucene scoring: coord_q_d factor

> On Wednesday 13 December 2006 16:42, Karl Koch wrote:
> > Do you know about any papers that discuss this?
> 
> Coordination is called co-ordination in the original idf paper by
> K. Spärck Jones, "A statistical interpretation of term specificity
> and its application in retrieval", Journal of Documentation 28,
> 11-21, 1972
> http://www.soi.city.ac.uk/~ser/idfpapers/ksj_orig.pdf
> 
> The paper is the first one on the idf page:
> http://www.soi.city.ac.uk/~ser/idf.html
> 
> Regards,
> Paul Elschot
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org



