lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Gururaja H <guru_h...@yahoo.com>
Subject RE: Relevance percentage
Date Tue, 21 Dec 2004 07:06:05 GMT
Thanks much for the reply.

Chuck Williams <chuck@manawiz.com> wrote:The coord() value is not saved anywhere so
you would need to recompute
it. You could either call explain() and parse the result string, or
better, look at explain() and implement what it does more efficiently
just for coord(). If your queries are all BooleanQuery's of
TermQuery's, then this is very simple. Iterate down the list of
BooleanClause's and count the number whose score is > 0, then divide
this by the total number of clauses. Take a look at
BooleanQuery.BooleanWeight.explain() as it does this (along with
generating the rest of the explanation). If you support the full Lucene
query language, then you need to look at all the query types and decide
what exactly you want to compute (as coord is not always well-defined).

I'm on the West Coast of the U.S. so evidently on a very different time
zone from you -- will look at your other message next.

Chuck

> -----Original Message-----
> From: Gururaja H [mailto:guru_hr29@yahoo.com]
> Sent: Monday, December 20, 2004 6:10 AM
> To: Lucene Users List; Mike Snare
> Subject: Re: Relevance percentage
> 
> Hi,
> 
> But, How to calculate the coord() fraction ? I know by default,
> in DefaultSimilarity the coord() fraction is defined as below:
> 
> /** Implemented as overlap / maxOverlap. */
> 
> public float coord(int overlap, int maxOverlap) {
> 
> return overlap / (float)maxOverlap;
> 
> }
> How to get the overlap and maxOverlap value in each of the matched
> document(s) ?
> 
> Thanks,
> Gururaja
> 
> Mike Snare wrote:
> I'm still new to Lucene, but wouldn't that be the coord()? My
> understanding is that the coord() is the fraction of the boolean
query
> that matched a given document.
> 
> Again, I'm new, so somebody else will have to confirm or deny...
> 
> -Mike
> 
> 
> On Mon, 20 Dec 2004 00:33:21 -0800 (PST), Gururaja H
> wrote:
> > How to find out the percentages of matched terms in the
document(s)
> using Lucene ?
> > Here is an example, of what i am trying to do:
> > The search query has 5 terms(ibm, risc, tape, dirve, manual) and
there
> are 4 matching
> > documents with the following attributes:
> > Doc#1: contains terms(ibm,drive)
> > Doc#2: contains terms(ibm,risc, tape, drive)
> > Doc#3: contains terms(ibm,risc, tape,drive)
> > Doc#4: contains terms(ibm, risc, tape, drive, manual).
> > The percentages displayed would be 100%(Doc#4), 80%(doc#2),
80%(doc#3)
> and 40%
> > (doc#1).
> >
> > Any help on how to go about doing this ?
> >
> > Thanks,
> > Gururaja
> >
> >
> > ---------------------------------
> > Do you Yahoo!?
> > Send a seasonal email greeting and help others. Do good.
> >
> 
>
---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
> 
> 
> 
> ---------------------------------
> Do you Yahoo!?
> All your favorites on one personal page - Try My Yahoo!

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org



__________________________________________________
Do You Yahoo!?
Tired of spam?  Yahoo! Mail has the best spam protection around 
http://mail.yahoo.com 
Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message