lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Simon Willnauer <simon.willna...@googlemail.com>
Subject Re: synonym payload boosting
Date Sun, 08 Nov 2009 13:46:32 GMT
You might get an answer on the solr list. This is the lucene users list.

Simon

On Nov 8, 2009 2:24 PM, "David Ginzburg" <davidginzburg@gmail.com> wrote:

Hi,
I have a field and a wighted synonym map.
I have indexed the synonyms with the weight as payload.
my code snippet from my filter

*public Token next(final Token reusableToken) throws IOException *
*        . *
*        . *
*        .*
      * Payload boostPayload;*
*
*
*        for (Synonym synonym : syns) {*
*            *
*            Token newTok = new Token(nToken.startOffset(),
nToken.endOffset(), "SYNONYM");*
*            newTok.setTermBuffer(synonym.getToken().toCharArray(), 0,
synonym.getToken().length());*
*            // set the position increment to zero*
*            // this tells lucene the synonym is*
*            // in the exact same location as the originating word*
*            newTok.setPositionIncrement(0);*
*            boostPayload = new
Payload(PayloadHelper.encodeFloat(synonym.getWieght()));*
*            newTok.setPayload(boostPayload);*
*
*
I have put it in the index time analyzer : this is my field definition:

*
<fieldType name="PersonName" class="solr.TextField"
positionIncrementGap="100" >
     <analyzer type="index">
       <tokenizer class="solr.WhitespaceTokenizerFactory"/>
       <filter class="solr.StopFilterFactory" ignoreCase="true"
words="stopwords.txt"/>
       <filter class="solr.LowerCaseFilterFactory"/>
       <filter class="com.digitaltrowel.solr.DTSynonymFactory"
FreskoFunction="names_with_scoresPipe23Columns.txt" ignoreCase="true"
expand="false"/>

       <!--<filter class="solr.EnglishPorterFilterFactory"
protected="protwords.txt"/>-->
       <!--<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>-->
     </analyzer>
     <analyzer type="query">
       <tokenizer class="solr.WhitespaceTokenizerFactory"/>
       <filter class="solr.LowerCaseFilterFactory"/>
       <!--<filter class="com.digitaltrowel.solr.DTSynonymFactory"
synonyms="synonyms.txt" ignoreCase="true" expand="false"/>-->
       <filter class="solr.StopFilterFactory" ignoreCase="true"
words="stopwords.txt"/>
       <!--<filter class="solr.EnglishPorterFilterFactory"
protected="protwords.txt"/>-->
       <!--<filter class="solr.RemoveDuplicatesTokenFilterFactory"/    >-->
     </analyzer>
   </fieldType>


my similarity class is
public class BoostingSymilarity extends DefaultSimilarity {


   public BoostingSymilarity(){
       super();

 }
   @Override
   public  float scorePayload(String field, byte [] payload, int offset,
int length)
{
 double weight = PayloadHelper.decodeFloat(payload, 0);
return (float)weight;
 }

@Override public float coord(int overlap, int maxoverlap)
 {
return 1.0f;
}

@Override public float idf(int docFreq, int numDocs)
{
 return 1.0f;
}

@Override public float lengthNorm(String fieldName, int numTerms)
 {
return 1.0f;
}

@Override public float tf(float freq)
{
 return 1.0f;
}
}

My problem is that scorePayload method does not get called at search time
like the other methods in  my similarity class.
I tested and verified it with break points.
What am I doing wrong?
I used solr 1.3 and thinking of the payload boos support in solr 1.4.


*

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message