lucene-java-user mailing list archives

From Chris Hostetter <hossman_luc...@fucit.org>
Subject Re: Scoring Technique based on Relevance Feeback & other Parameters
Date Wed, 23 Aug 2006 19:44:03 GMT

: package. By implementing a new type of tuple (Query, Weight, Scorer) I can
: easily implement a new scoring technique. Unfortunately, the Lucene index
: shows that it stores only TF / position vectors for each term within a
: document.

:         I am interested in investigating a new scoring technique where I will
: use some other parameters relating to the term to rank the documents. For
: example, web page ranking is assisted by parameters like the number of links
: towards a web page and the number of links from a web page. It indicates
: that we need to store relatively more information about terms within the
: index. But how? I need to investigate

there is a distinction between storing more information about a term and
storing additional information about a document.

the flexible payload type approaches that have been discussed should make
storing info about a term easy (i.e.: the term is "wind", its type is "noun",
its usage in the sentence is as a "subject", its importance is "88.3"), but
you can already store additional information about documents (like the
total popularity of a document) in Lucene -- either by using the document
boost (if you always want it to be part of the score calculations) or as a
separate field which you can factor into the score calculations using
something like FunctionQuery...

http://incubator.apache.org/solr/docs/api/org/apache/solr/search/function/package-summary.html

...i use this all the time to make "recent" docs score better, or "more
popular" docs score better.
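
a rough sketch of the "separate field factored into the score" idea, in
plain Java so it runs without any Lucene or Solr classes. it does NOT use
the FunctionQuery API; the class BoostedScoring, the Hit type, the
popularity value, the weight parameter and the log damping are all
assumptions made up for illustration. the point is only that a
per-document number stored alongside the document can be combined with the
text relevance score at ranking time:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration only: not a Lucene or Solr class.
class BoostedScoring {

    // One search hit: the text relevance score plus a per-document value
    // (here "popularity") that would be stored as a separate field.
    static final class Hit {
        final String id;
        final float relevance;
        final float popularity;

        Hit(String id, float relevance, float popularity) {
            this.id = id;
            this.relevance = relevance;
            this.popularity = popularity;
        }
    }

    // Assumed combination: relevance plus a weighted, log-damped popularity.
    // The log keeps a very popular document from swamping the text match.
    static float combinedScore(Hit hit, float weight) {
        return hit.relevance + weight * (float) Math.log1p(hit.popularity);
    }

    // Returns document ids ordered by combined score, best first.
    static List<String> rank(List<Hit> hits, float weight) {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort(Comparator
                .comparingDouble((Hit h) -> combinedScore(h, weight))
                .reversed());
        List<String> ids = new ArrayList<>();
        for (Hit h : sorted) {
            ids.add(h.id);
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Hit> hits = new ArrayList<>();
        hits.add(new Hit("a", 1.0f, 0f));    // better text match, no popularity
        hits.add(new Hit("b", 0.8f, 100f));  // weaker text match, popular
        System.out.println(rank(hits, 0.0f)); // popularity ignored
        System.out.println(rank(hits, 0.5f)); // popularity factored in
    }
}
```

with weight 0 the order is decided by the text match alone; raising the
weight lets the stored per-document value pull a document up. a document
boost behaves differently in that it is applied on every query, whereas a
separate field like this can be factored in only when you ask for it.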


-Hoss



