lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Chuck Williams" <ch...@manawiz.com>
Subject RE: Relevance percentage
Date Thu, 23 Dec 2004 08:56:28 GMT
Gururaja,

If you want to score based solely on coord(), then Paul's approach looks
best.  However, based on your earlier messages, it looks to me like you
want to score based on all factors (with coord boosted as Paul
suggested, or lengthNorm flattened as I suggested -- either will get the
order you want in the example you posted), but you want to print the
(unboosted) coord percentage along with each result in the result list.

If this is the case, since the number of results per page on the result
list is presumably small, I think you are best off replicating the
explain() mechanism.  I don't have the source code, but you can look at
IndexSearcher.explain(), which recreates the weight with Query.weight(),
then calls what in this case will be
BooleanQuery.BooleanWeight.explain(), which has the code to recompute
coord on a result (specifically it computes overlap and maxoverlap and
then calls Similarity.coord()).  You could cut and paste this code to
just compute coord for your top-level BooleanQuery's.

Sorry I don't have source code to do this, but the approach should work.
Good luck,

Chuck

  > -----Original Message-----
  > From: Paul Elschot [mailto:paul.elschot@xs4all.nl]
  > Sent: Wednesday, December 22, 2004 11:59 PM
  > To: lucene-user@jakarta.apache.org
  > Subject: Re: Relevance percentage
  > 
  > On Thursday 23 December 2004 08:13, Gururaja H wrote:
  > > Hi Chuck Williams,
  > >
  > > Thanks much for the reply.
  > >
  > > >If your queries are all BooleanQuery's of
  > > >TermQuery's, then this is very simple. Iterate down the list of
  > > >BooleanClause's and count the number whose score is > 0, then
divide
  > > >this by the total number of clauses. Take a look at
  > > >BooleanQuery.BooleanWeight.explain() as it does this (along with
  > > >generating the rest of the explanation). If you support the full
  > Lucene
  > > >query language, then you need to look at all the query types and
  > decide
  > > >what exactly you want to compute (as coord is not always well-
  > defined).
  > >
  > > We are supporting full Lucene query language.
  > >
  > > My request is, assuming queries are all BooleanQuery please
  > > post the implementation source code for the same.  ie to calculate
the
  > coord() method input parameters overlap and maxOverlap.
  > 
  > I don't have the code, but I can give an overview of possible
  > steps:
  > 
  > First inherit from BooleanScorer to implement a score() method that
  > returns only the coord() value (preferably a precomputed one).
  > Then inherit from BooleanQuery.BooleanWeight to return the above
  > Scorer.
  > Then inherit from BooleanQuery to use the above Weight in
createWeight().
  > Then inherit from QueryParser to use the above Query in
  > getBooleanQuery().
  > Finally use such a query in a search: the document scores will be
  > the coord() values.
  > 
  > Regards,
  > Paul Elschot.
  > 
  > 
  >
---------------------------------------------------------------------
  > To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
  > For additional commands, e-mail: lucene-user-help@jakarta.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message