lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Dejan Nenov" <>
Subject RE: Scoring Technique based on Relevance Feeback & other Parameters
Date Wed, 23 Aug 2006 15:58:14 GMT
Indeed - you bring up interesting questions. You may want to take a look at
NUTCH first, however - I am not sure if they have done some of the
Google-like ranking you mention.

However - collaborative relevance enhancement, based on user feedback, would
be a nice Web-2.0-ish feature to bake into the Lucene core :-)





From: sachin [] 
Sent: Wednesday, August 23, 2006 5:31 AM
Subject: Scoring Technique based on Relevance Feeback & other Parameters


Hello Great/smart guys 

       This is my first question for this group as I started working on the
Lucene last month. 


        Lucene provide the scoring of documents based on TF-IDF vector
analysis. Lucene also provides the Scorer and Weight inside the Search
package. By implementing new type of  tuple (Query,Weight,Scorer) I can
easily implement new Scoring technique. Unfortunatly Lucene index shows that
it stores only TF / Position vectors for each term within document. 


        I am interested in investigating new scoring technique where I will
use some other parameters relating to the Term to rank the documents. For an
example web page ranking is assisted by parameters like number of links
towards webpage and number of link from web - page.  It indicates that we
need to store relatively more information about terms within the index. But
HoW ? . I need to investigate


        Another parameter is relevance feedback from the User. Ranking
should get affected by relevance feedback from the user. 


Would someone interested in helping out or thinking about the same problem.

  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message