Date: Wed, 23 Aug 2006 12:44:03 -0700 (PDT)
From: Chris Hostetter <hossman_lucene@fucit.org>
To: java-user@lucene.apache.org
Subject: Re: Scoring Technique based on Relevance Feeback & other Parameters
In-Reply-To: <20060823123108.4D79110FB00B@asf.osuosl.org>
Message-ID: <Pine.LNX.4.58.0608231238360.12489@hal.rescomp.berkeley.edu>
References: <20060823123108.4D79110FB00B@asf.osuosl.org>


: package. By implementing a new type of tuple (Query,Weight,Scorer) I can
: easily implement a new scoring technique. Unfortunately, the Lucene index
: shows that it stores only TF / position vectors for each term within a
: document.

:         I am interested in investigating a new scoring technique where I will
: use some other parameters relating to the term to rank the documents. For
: example, web page ranking is assisted by parameters like the number of links
: towards the web page and the number of links from the web page. This
: indicates that we need to store relatively more information about terms
: within the index. But how? I need to investigate

there is a distinction between storing more information about a term and
storing additional information about a document.

the flexible payload type approaches that have been discussed should make
info about a term easy to store (ie: the term is "wind", its type is
"noun", its usage in the sentence is as a "subject", its importance is
"88.3") but you can already store additional information about documents
(like the total popularity of a document) in Lucene -- either by using the
document boost (if you always want it to be part of the score
calculations) or as a separate field which you can factor into the score
calculations using something like FunctionQuery...

http://incubator.apache.org/solr/docs/api/org/apache/solr/search/function/package-summary.html

...i use this all the time to make "recent" docs score better, or "more
popular docs" score better.
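
To make the idea concrete, here is a minimal sketch in plain Java of
folding document-level values into a score. It does not use the Lucene or
Solr API at all: the class name, the Hit fields, the two weights, the
30-day decay constant, and the additive formula are all assumptions chosen
for illustration, not how FunctionQuery itself computes anything.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration only: hypothetical names, no Lucene/Solr calls.
public class DocValueBoostSketch {

    // Assumed weights; tune them against your own data.
    static final double POPULARITY_WEIGHT = 0.1;
    static final double RECENCY_WEIGHT = 0.5;

    /** One search hit: a text score plus two document-level values. */
    static final class Hit {
        final String id;
        final double textScore;  // relevance from ordinary term matching
        final double popularity; // e.g. inbound link count, stored per document
        final double ageDays;    // days since the document was published

        Hit(String id, double textScore, double popularity, double ageDays) {
            this.id = id;
            this.textScore = textScore;
            this.popularity = popularity;
            this.ageDays = ageDays;
        }
    }

    /** 1.0 for a brand-new document, decaying toward 0 as it ages. */
    static double recency(double ageDays) {
        return 1.0 / (1.0 + ageDays / 30.0);
    }

    /** Text score plus weighted document-level terms; log1p damps huge counts. */
    static double combinedScore(Hit h) {
        return h.textScore
                + POPULARITY_WEIGHT * Math.log1p(h.popularity)
                + RECENCY_WEIGHT * recency(h.ageDays);
    }

    /** Returns a new list sorted by combined score, best first. */
    static List<Hit> rank(List<Hit> hits) {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort((a, b) -> Double.compare(combinedScore(b), combinedScore(a)));
        return sorted;
    }

    public static void main(String[] args) {
        List<Hit> hits = Arrays.asList(
                new Hit("old-unlinked", 1.0, 0, 365),
                new Hit("fresh-popular", 0.9, 1000, 10));
        for (Hit h : rank(hits)) {
            System.out.println(h.id + " " + combinedScore(h));
        }
    }
}
```

With these assumed weights, a slightly weaker text match that is recent
and heavily linked outranks a stronger match that is old and unlinked.
Note that nothing here needs per-term data: popularity and age are stored
once per document.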


-Hoss

