lucene-java-user mailing list archives

From Chris Hostetter
Subject Re: Scoring Technique based on Relevance Feeback & other Parameters
Date Wed, 23 Aug 2006 19:44:03 GMT

: package. By implementing a new type of tuple (Query, Weight, Scorer) I can
: easily implement a new scoring technique. Unfortunately, the Lucene index shows
: that it stores only TF / position vectors for each term within a document.

:         I am interested in investigating a new scoring technique where I will
: use some other parameters relating to the term to rank the documents. For
: example, web page ranking is assisted by parameters like the number of links
: towards a web page and the number of links from it. This indicates that we
: need to store relatively more information about terms within the index. But
: how? I need to investigate.

there is a distinction between storing more information about a term and
storing additional information about a document.

the flexible payload type approaches that have been discussed should make
storing info about a term easy (ie: the term is "wind", its type is "noun",
its usage in the sentence is as a "subject", its importance is "88.3") but
you can already store additional information about documents (like the
total popularity of a document) in Lucene -- either by using the document
boost (if you always want it to be part of the score calculations) or as a
separate field which you can factor into the score calculations using
something like FunctionQuery...

...i use this all the time to make "recent" docs score better, or "more
popular docs" score better.
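
A rough sketch of the two options described above, in plain Java. This is
not the Lucene or Solr API: the class BoostedScoring, the Hit type, and the
weighting formulas are all hypothetical and only illustrate how a
per-document value can be folded into a text relevance score, either as a
fixed boost or as a function of stored field values.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Conceptual sketch only: plain Java, not the Lucene or Solr API. */
final class BoostedScoring {

    /** A search hit: its text relevance score plus two per-document values. */
    static final class Hit {
        final String id;
        final float textScore;   // score from ordinary term matching
        final float popularity;  // e.g. a stored "popularity" field
        final long ageDays;      // e.g. derived from a stored date field

        Hit(String id, float textScore, float popularity, long ageDays) {
            this.id = id;
            this.textScore = textScore;
            this.popularity = popularity;
            this.ageDays = ageDays;
        }
    }

    /** Document-boost style: the boost is always multiplied into the score. */
    static float withDocumentBoost(float textScore, float boost) {
        return textScore * boost;
    }

    /** Function-of-a-field style: field values are factored in at query time. */
    static float withFieldFunction(Hit h) {
        // Assumed weighting: log-damped popularity, reciprocal decay with age.
        float popularityFactor = 1.0f + (float) Math.log1p(h.popularity);
        float recencyFactor = 1.0f / (1.0f + h.ageDays / 365.0f);
        return h.textScore * popularityFactor * recencyFactor;
    }

    /** Returns a copy of the hits sorted by the combined score, best first. */
    static List<Hit> rank(List<Hit> hits) {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort(Comparator
                .comparingDouble((Hit h) -> withFieldFunction(h))
                .reversed());
        return sorted;
    }
}
```

With equal text scores, a newer or more popular hit ranks ahead of an older
or less popular one. The practical difference is that a boost is fixed when
the document is indexed, while a function over a field can be changed per
query without reindexing.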

