lucene-java-user mailing list archives

From Sujit Pal <sujit....@comcast.net>
Subject Re: How to define different similarity scores per field ?
Date Tue, 01 Mar 2011 20:12:06 GMT
One way to do this currently is to build a per-field Similarity wrapper
that dispatches on the field name. I believe there is work under way to
make Lucene's Similarity pluggable for this sort of thing, but in the
meantime, this is what I did:

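import java.util.HashMap;
import java.util.Map;

import org.apache.lucene.search.DefaultSimilarity;
import org.apache.lucene.search.Similarity;

public class MyPerFieldSimilarityWrapper extends Similarity {

  // Fallback for any field that has no entry in the map.
  private final Similarity defaultSimilarity;
  // Field name -> Similarity to use for that field.
  private final Map<String,Similarity> fieldSimilarityMap;

  public MyPerFieldSimilarityWrapper() {
    this.defaultSimilarity = new DefaultSimilarity();
    this.fieldSimilarityMap = new HashMap<String,Similarity>();
    this.fieldSimilarityMap.put("fieldA", new FieldASimilarity());
    ...
  }

  @Override
  public float lengthNorm(String fieldName, int numTokens) {
    // Use the field-specific Similarity if one is registered.
    Similarity sim = fieldSimilarityMap.get(fieldName);
    if (sim == null) {
      return defaultSimilarity.lengthNorm(fieldName, numTokens);
    } else {
      return sim.lengthNorm(fieldName, numTokens);
    }
  }
  // Same for scorePayload. For the other methods, I just delegate
  // to defaultSimilarity (all I really need is scorePayload in
  // my case).
}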

Then in Solr's schema.xml, I set this class as the similarity class:
  <similarity class="com.mycompany.MyPerFieldSimilarityWrapper"/>
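
For anyone who wants to try the dispatch idea without a Lucene jar on the classpath, here is a minimal standalone sketch of the same pattern. Everything in it is an assumption made for illustration: LengthNormalizer, DefaultNormalizer, FlatNormalizer, PerFieldNormalizer and the register method are hypothetical stand-ins, not Lucene classes. DefaultNormalizer uses 1/sqrt(numTokens), which is intended to mirror the classic default length norm.

```java
import java.util.HashMap;
import java.util.Map;

final class PerFieldDispatchSketch {

  /** Hypothetical stand-in for the one Similarity method used in this sketch. */
  interface LengthNormalizer {
    float lengthNorm(String fieldName, int numTokens);
  }

  /** Fallback behaviour: 1/sqrt(numTokens), so longer fields score lower. */
  static final class DefaultNormalizer implements LengthNormalizer {
    @Override
    public float lengthNorm(String fieldName, int numTokens) {
      return (float) (1.0 / Math.sqrt(numTokens));
    }
  }

  /** Hypothetical field-specific behaviour: ignore field length entirely. */
  static final class FlatNormalizer implements LengthNormalizer {
    @Override
    public float lengthNorm(String fieldName, int numTokens) {
      return 1.0f;
    }
  }

  /** Picks a normalizer by field name and falls back to the default. */
  static final class PerFieldNormalizer implements LengthNormalizer {
    private final LengthNormalizer fallback = new DefaultNormalizer();
    private final Map<String, LengthNormalizer> byField =
        new HashMap<String, LengthNormalizer>();

    /** Registers a field-specific normalizer; returns this for chaining. */
    PerFieldNormalizer register(String fieldName, LengthNormalizer normalizer) {
      byField.put(fieldName, normalizer);
      return this;
    }

    @Override
    public float lengthNorm(String fieldName, int numTokens) {
      LengthNormalizer specific = byField.get(fieldName);
      LengthNormalizer chosen = (specific != null) ? specific : fallback;
      return chosen.lengthNorm(fieldName, numTokens);
    }
  }

  public static void main(String[] args) {
    PerFieldNormalizer norms =
        new PerFieldNormalizer().register("fieldA", new FlatNormalizer());
    System.out.println(norms.lengthNorm("fieldA", 4)); // registered: prints 1.0
    System.out.println(norms.lengthNorm("fieldB", 4)); // fallback: prints 0.5
  }
}
```

The point of the sketch is only the lookup-then-fallback step. In the real wrapper, every overridden Similarity method that should vary by field (lengthNorm, scorePayload) would repeat that step, and the rest would go straight to the default.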

hth
-sujit

On Tue, 2011-03-01 at 20:41 +0100, Patrick Diviacco wrote:
> I need to define different similarity scores per document field.
> 
> For example for field A I want to use Lucene tf.idf score, for the numerical
> field B I want to use a different metric (difference between values) and so
> on...
> 
> thanks



