lucene-java-user mailing list archives

From Damerian <dameria...@gmail.com>
Subject Custom scoring
Date Thu, 23 Feb 2012 14:10:19 GMT
Hello,
  I am trying to implement my own Jaccard similarity for Lucene.
So far I have the following code:
public class JaccardSimilarity extends DefaultSimilarity {
     int numberOfDocumentTerms;
     // String field = "contents"; // Should the Jaccard similarity be based
     // only on the contents field?

     @Override
     public float idf(int i, int i1) {
         return 1;
     }

     @Override
     public float tf(int i) {
         return 1;
     }

     public int getNumberOfDocumentTerms() {
         return numberOfDocumentTerms;
     }

     public void setNumberOfDocumentTerms(int numberOfDocumentTerms) {
         this.numberOfDocumentTerms = numberOfDocumentTerms;
     }

     @Override
     public float queryNorm(float i) {
         return 1.0f;
     }

     @Override
     public float computeNorm(String field, FieldInvertState state) {
         // For each field, get the number of terms.
         numberOfDocumentTerms = state.getLength();
         setNumberOfDocumentTerms(numberOfDocumentTerms);

         System.out.println("numberOfDocumentTerms from compute : "
                 + numberOfDocumentTerms);
         return 1.0f;
     }

     @Override
     public float coord(int overlap, int maxOverlap) {
         System.out.println("numberOfDocumentTerms : "
                 + getNumberOfDocumentTerms());
         // Cast to float so the ratio is not truncated by integer division.
         return (float) overlap / (numberOfDocumentTerms + (maxOverlap - overlap));
     }
}

The problem is that the coord() method does not seem to be called (as far
as I can tell), either during searching or during indexing.
What am I doing wrong? I need the

    |overlap| - the number of query terms matched in the document
    |maxOverlap| - the total number of terms in the query
to implement my scoring.
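
For reference, the ratio that coord() above is meant to return can be sketched in plain Java, independent of Lucene. This is only an illustration under stated assumptions: the class name JaccardSketch and the method jaccard are hypothetical, all three counts are assumed to be counts of distinct terms, and the zero-union guard is an added safety check that is not in the class above.

```java
// Hypothetical helper, not part of Lucene: computes the Jaccard coefficient
// |Q ∩ D| / |Q ∪ D| from the three counts the similarity class works with.
final class JaccardSketch {

    /**
     * @param overlap    number of query terms matched in the document
     * @param maxOverlap total number of terms in the query
     * @param docTerms   number of terms in the document (assumed distinct)
     */
    static float jaccard(int overlap, int maxOverlap, int docTerms) {
        // |Q ∪ D| = |D| + |Q| - |Q ∩ D|
        int union = docTerms + (maxOverlap - overlap);
        if (union <= 0) {
            return 0f; // assumption: treat an empty union as no similarity
        }
        // Cast before dividing; int / int would truncate the ratio to 0.
        return (float) overlap / union;
    }

    public static void main(String[] args) {
        // 2 of 3 query terms match a document with 5 terms: 2 / (5 + 1)
        System.out.println(jaccard(2, 3, 5));
    }
}
```

The float cast matters here: with int operands the division is truncated, so any partial match would score 0.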
Any help would be highly appreciated.
Thank you in advance!

