lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Erik Hatcher <e...@ehatchersolutions.com>
Subject Re: Relevance and ranking ...
Date Fri, 17 Dec 2004 10:28:08 GMT
The coord() factor of Similarity is what controls a muliplier factor 
for overlapping query terms in a document.  The DefaultSimilarity 
already contains factors that allow documents with overlapping terms to 
get boosted.  Is this not working for you?  You may also need to adjust 
length normalization factors.  Check the javadocs on Similarity for 
details on implementing your own formulas.  Also become familiar with 
IndexSearcher.explain() and the Explanation so that you can see how 
adjusting things affects the details.

	Erik

On Dec 17, 2004, at 3:42 AM, Gururaja H wrote:

> Hi,
>
> How to implement the following ?  Please provide inputs ....
>
>
> For example, if the search query has 5 terms (ibm, risc, tape, drive, 
> manual) and there are 4 matching documents with the following 
> attributes, then the order should be as described below.
>
> Doc#1: contains terms (ibm, drive) and has a total of 100 terms in the 
> document.
>
> Doc#2: contains terms (ibm, risc, tape, drive) and has a total of 30 
> terms in the document.
>
> Doc#3: contains terms (ibm, risc, tape, drive) and has a total of 100 
> terms in the document.
>
> Doc#4: contains terms (ibm, risc, tape, drive, manual) and has a total 
> of 300 terms in the document
>
> The search results should include all three documents since each has 
> one or more of the search terms, however, the order should be returned 
> as:
>
> Doc#4
>
> Doc#2
>
> Doc#3
>
> Doc#1
>
> Doc#4 should be first, since of the 5 search terms, it contains all 5.
>
> Doc#2 should be second, since it has 4 of the 5 search terms and of 
> the number of terms in the document, its ratio is higher than Doc#3 
> (4/30). Doc#3 has 4 of the 5 terms, but its ratio is 4/100.
>
> Doc#1 is last since it only has 2 of the 5 terms.
>
>
>                                                                   ----
>
> Thanks,
> Gururaja
>
>
> __________________________________________________
> Do You Yahoo!?
> Tired of spam?  Yahoo! Mail has the best spam protection around
> http://mail.yahoo.com 


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message