lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From David Ginzburg <davidginzb...@gmail.com>
Subject synonym payload boosting
Date Sun, 08 Nov 2009 13:23:28 GMT
Hi,
I have a field and a wighted synonym map.
I have indexed the synonyms with the weight as payload.
my code snippet from my filter

*public Token next(final Token reusableToken) throws IOException *
*        . *
*        . *
*        .*
       * Payload boostPayload;*
*
*
*        for (Synonym synonym : syns) {*
*            *
*            Token newTok = new Token(nToken.startOffset(),
nToken.endOffset(), "SYNONYM");*
*            newTok.setTermBuffer(synonym.getToken().toCharArray(), 0,
synonym.getToken().length());*
*            // set the position increment to zero*
*            // this tells lucene the synonym is*
*            // in the exact same location as the originating word*
*            newTok.setPositionIncrement(0);*
*            boostPayload = new
Payload(PayloadHelper.encodeFloat(synonym.getWieght()));*
*            newTok.setPayload(boostPayload);*
*
*
I have put it in the index time analyzer : this is my field definition:

*
<fieldType name="PersonName" class="solr.TextField"
positionIncrementGap="100" >
      <analyzer type="index">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true"
words="stopwords.txt"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="com.digitaltrowel.solr.DTSynonymFactory"
FreskoFunction="names_with_scoresPipe23Columns.txt" ignoreCase="true"
expand="false"/>

        <!--<filter class="solr.EnglishPorterFilterFactory"
protected="protwords.txt"/>-->
        <!--<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>-->
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <!--<filter class="com.digitaltrowel.solr.DTSynonymFactory"
synonyms="synonyms.txt" ignoreCase="true" expand="false"/>-->
        <filter class="solr.StopFilterFactory" ignoreCase="true"
words="stopwords.txt"/>
        <!--<filter class="solr.EnglishPorterFilterFactory"
protected="protwords.txt"/>-->
        <!--<filter class="solr.RemoveDuplicatesTokenFilterFactory"/    >-->
      </analyzer>
    </fieldType>


my similarity class is
public class BoostingSymilarity extends DefaultSimilarity {


    public BoostingSymilarity(){
        super();

  }
    @Override
    public  float scorePayload(String field, byte [] payload, int offset,
int length)
{
 double weight = PayloadHelper.decodeFloat(payload, 0);
return (float)weight;
 }

@Override public float coord(int overlap, int maxoverlap)
 {
return 1.0f;
}

@Override public float idf(int docFreq, int numDocs)
{
 return 1.0f;
}

@Override public float lengthNorm(String fieldName, int numTerms)
 {
return 1.0f;
}

@Override public float tf(float freq)
{
 return 1.0f;
}
}

My problem is that scorePayload method does not get called at search time
like the other methods in  my similarity class.
I tested and verified it with break points.
What am I doing wrong?
I used solr 1.3 and thinking of the payload boos support in solr 1.4.


*

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message