lucene-java-user mailing list archives

From Eugene <echot...@gmail.com>
Subject Re: Help on Similarity
Date Mon, 06 Mar 2006 16:49:59 GMT
Following up on my earlier post: there seems to be a bug in Lucene 1.9.1.

I tried using the similarity below and changed idf to:
  public float idf(int docFreq, int numDocs) {
    float f = (float)(Math.log((double)numDocs/(double)(docFreq+1) + 1.0));
    return f;
  }

Now, when I print the explanation for the top doc id, it includes every
term in the query twice with a raw score of 11.50651, even though some
terms don't appear in any docs. And the max raw score of the top doc is
only 4.12327.
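
As a sanity check on the formula itself, the modified idf can be evaluated outside Lucene. The sketch below is only an illustration: the class name IdfSketch and the index size of 100000 documents are assumptions, not values from this thread. It shows that the expression is positive for every docFreq, including docFreq = 0, where it reduces to ln(numDocs + 1).

```java
// Standalone sketch: evaluates the modified idf formula from the post without Lucene.
// The class name IdfSketch and the collection size below are assumptions for illustration.
class IdfSketch {

    // Same expression as the idf(int docFreq, int numDocs) override in the post.
    static float idf(int docFreq, int numDocs) {
        return (float) Math.log((double) numDocs / (double) (docFreq + 1) + 1.0);
    }

    public static void main(String[] args) {
        int numDocs = 100000; // hypothetical index size, not taken from the post
        int[] docFreqs = {0, 1, 10, 1000, numDocs};
        for (int docFreq : docFreqs) {
            // Print the idf for a term that appears in docFreq documents.
            System.out.println("docFreq=" + docFreq + " idf=" + idf(docFreq, numDocs));
        }
    }
}
```

Because numDocs/(docFreq+1) + 1.0 is greater than 1 whenever numDocs > 0, this idf never reaches zero, so a term that occurs in no documents still gets a large idf. Whether that accounts for the 11.50651 figure depends on the index size, which the post does not state.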

Has anyone encountered this before?

Thanks

Eugene wrote:
> Hi,
> 
> I tried implementing my own Similarity and setting it in 
> IndexWriter.setSimilarity(new CosSimilarity()).
> 
> But something is odd: Lucene doesn't seem to call the methods in my
> Similarity. For example, when I set idf to return 0.0f, I still get
> a score > 0.0f.
> 
> How do I correctly set the Similarity? I'm quite new to this, so some
> links on implementing Similarity would also be useful.
> 
> Thanks.
> 
> -- 
> Eugene
> 
> Here's the code for my CosSimilarity:
> 
> import org.apache.lucene.search.Similarity;
> 
> public class CosSimilarity extends Similarity
> {
>   public float lengthNorm(String fieldName, int numTerms) {
>     return 1.0f;
>   }
> 
>   public float queryNorm(float sumOfSquaredWeights) {
>     return (float)(1.0 / Math.sqrt(sumOfSquaredWeights));
>   }
> 
>   public float tf(float freq) {
>     return (float)(1 + Math.log(1 + freq));
>   }
> 
>   public float sloppyFreq(int distance) {
>     return 1.0f / (distance + 1);
>   }
> 
>   public float idf(int docFreq, int numDocs) {
>     float f = (float)(Math.log((double)numDocs/(double)(docFreq+1) + 1.0));
>     System.out.println("CosSimilarity.idf>" + f);
>     // Deliberately return 0 to check whether this method affects the score.
>     return 0.0f;
>   }
> 
>   public float coord(int overlap, int maxOverlap) {
>     return overlap / (float)maxOverlap;
>   }
> 
> }
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 
> 
