lucene-java-user mailing list archives

From scott w <scottbl...@gmail.com>
Subject Re: Question about how to speed up custom scoring
Date Fri, 09 Oct 2009 21:19:46 GMT
Right, exactly. I looked into payloads initially and realized they wouldn't
work for my use case.
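
To illustrate the distinction, here is a minimal sketch in plain Java (no Lucene calls) of the query-time weighting under discussion. The class name QueryTimeWeights, the "field=value" key format, and the method names are assumptions made for illustration; none of them come from the thread. Because the weight map arrives with each query, the same document can score differently from one request to the next, which a value fixed at index time could not express.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; not part of Lucene or of the posted class.
final class QueryTimeWeights {

    // Sums the weights whose "field=value" key matches a field of the document.
    static float termWeightedScore(Map<String, String> docFields,
                                   Map<String, Float> weights) {
        float score = 0f;
        for (Map.Entry<String, String> field : docFields.entrySet()) {
            Float weight = weights.get(field.getKey() + "=" + field.getValue());
            if (weight != null) {
                score += weight;
            }
        }
        return score;
    }

    // Linear blend of the relevance score and the weighted score.
    static float combine(float bias, float relevance, float termWeighted) {
        if (bias < 0 || bias > 1) {
            throw new IllegalArgumentException("Bias must be between 0 and 1");
        }
        return bias * relevance + (1 - bias) * termWeighted;
    }

    public static void main(String[] args) {
        Map<String, String> doc = new HashMap<>();
        doc.put("company", "Microsoft");
        doc.put("city", "Redmond");
        doc.put("size", "large");

        // Weights supplied with this particular query.
        Map<String, Float> weights = new HashMap<>();
        weights.put("company=Microsoft", 0.3f);
        weights.put("size=large", 0.5f);

        float weighted = termWeightedScore(doc, weights);
        System.out.println(combine(0.5f, 1.0f, weighted));
    }
}
```

With the example document from the original question (company=Microsoft, city=Redmond, size=large) and weights of 0.3 for company=Microsoft and 0.5 for size=large, termWeightedScore returns approximately 0.8. Changing the weight map changes the score without touching the document.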

On Fri, Oct 9, 2009 at 2:00 PM, Grant Ingersoll <gsingers@apache.org> wrote:

> Oops, just reread and realized you wanted query time weights.  Payloads are
> an index time thing.
>
>
> On Oct 9, 2009, at 5:49 PM, Grant Ingersoll wrote:
>
>  If you are trying to add specific term weights to terms in the index and
>> then incorporate them into scoring, you might benefit from payloads and the
>> PayloadTermQuery option.  See
>> http://www.lucidimagination.com/blog/2009/08/05/getting-started-with-payloads/
>>
>> -Grant
>>
>> On Oct 8, 2009, at 11:56 AM, scott w wrote:
>>
>>  Oops, forgot to include the class I mentioned. Here it is:
>>>
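>>> import java.io.IOException;
>>> import java.util.Map;
>>>
>>> import org.apache.lucene.document.Document;
>>> import org.apache.lucene.index.IndexReader;
>>> import org.apache.lucene.search.Query;
>>> import org.apache.lucene.search.function.CustomScoreQuery;
>>>
>>> // Blends the sub-query's relevance score with a weighted sum of
>>> // per-document field values; the weights are supplied at query time.
>>> public class QueryTermBoostingQuery extends CustomScoreQuery {
>>>   private final Map<String, Float> queryTermWeights; // field name -> weight
>>>   private final float bias; // share of the final score taken from relevance
>>>   private final IndexReader indexReader;
>>>
>>>   public QueryTermBoostingQuery( Query q, Map<String, Float> termWeights,
>>>       IndexReader indexReader, float bias ) {
>>>     super( q );
>>>     if (bias < 0 || bias > 1) {
>>>       throw new IllegalArgumentException( "Bias must be between 0 and 1" );
>>>     }
>>>     this.indexReader = indexReader;
>>>     this.bias = bias;
>>>     this.queryTermWeights = termWeights;
>>>   }
>>>
>>>   @Override
>>>   public float customScore( int doc, float subQueryScore, float valSrcScore ) {
>>>     // Loads the stored document for every scored hit.
>>>     Document document;
>>>     try {
>>>       document = indexReader.document( doc );
>>>     } catch (IOException e) {
>>>       // SearchException is the poster's own exception type (not shown).
>>>       throw new SearchException( e );
>>>     }
>>>     float termWeightedScore = 0;
>>>
>>>     for (Map.Entry<String, Float> entry : queryTermWeights.entrySet()) {
>>>       String docFieldValue = document.get( entry.getKey() );
>>>       Float weight = entry.getValue();
>>>       if (docFieldValue != null && weight != null) {
>>>         // The stored field value is parsed as a number.
>>>         termWeightedScore += weight * Float.parseFloat( docFieldValue );
>>>       }
>>>     }
>>>     return bias * subQueryScore + (1 - bias) * termWeightedScore;
>>>   }
>>> }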
>>>
>>> On Thu, Oct 8, 2009 at 7:54 AM, scott w <scottblanc@gmail.com> wrote:
>>>
>>>  I am trying to come up with a performant query that lets me use a custom
>>>> score. The custom score is a sum-product over a set of query-time weights,
>>>> where each weight is applied only if the query-time term exists in the
>>>> document. For example, if I have a doc with three fields,
>>>> company=Microsoft, city=Redmond, and size=large, I may want to score that
>>>> document with the function (company==Microsoft ? 0.3 : 0) +
>>>> (size==large ? 0.5 : 0) to get a score of 0.8. Attached is a subclass I
>>>> have tested that implements this with one extra component: it allows the
>>>> relevance score to be combined in.
>>>>
>>>> The problem is that this custom score is not performant at all. For
>>>> example, on a small index of 5 million documents with 10 weights passed
>>>> in, it does 0.01 req/sec.
>>>>
>>>> Is there a way to compute the same custom score in a much more performant
>>>> way?
>>>>
>>>> thanks,
>>>> Scott
>>>>
>>>>
>> --------------------------
>> Grant Ingersoll
>> http://www.lucidimagination.com/
>>
>> Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using
>> Solr/Lucene:
>> http://www.lucidimagination.com/search
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>
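
A possible direction for the performance question, though not one proposed anywhere in this thread, is to avoid loading the stored document for each hit and instead read each field's value from an in-memory array indexed by document id. The sketch below shows only that idea in plain Java. ColumnarScorer and its constructor arguments are hypothetical, and how the arrays get populated is left open. In Lucene of that era, FieldCache.DEFAULT.getFloats(reader, field) is one candidate source for such arrays, assuming each document has a single indexed numeric term in the field.

```java
import java.util.Arrays;
import java.util.Map;

// Hypothetical illustration; the per-field arrays are assumed to be built elsewhere.
final class ColumnarScorer {
    private final float[][] columns; // columns[i][doc] = value of field i for doc
    private final float[] weights;   // weights[i] = query-time weight of field i
    private final float bias;

    ColumnarScorer(Map<String, float[]> fieldColumns,
                   Map<String, Float> queryTermWeights, float bias) {
        if (bias < 0 || bias > 1) {
            throw new IllegalArgumentException("Bias must be between 0 and 1");
        }
        this.bias = bias;

        // Resolve the map lookups once per query rather than once per document.
        float[][] cols = new float[queryTermWeights.size()][];
        float[] w = new float[queryTermWeights.size()];
        int n = 0;
        for (Map.Entry<String, Float> e : queryTermWeights.entrySet()) {
            float[] column = fieldColumns.get(e.getKey());
            if (column != null && e.getValue() != null) {
                cols[n] = column;
                w[n] = e.getValue();
                n++;
            }
        }
        this.columns = Arrays.copyOf(cols, n);
        this.weights = Arrays.copyOf(w, n);
    }

    // Same blend as the posted customScore, using array reads only.
    float customScore(int doc, float subQueryScore) {
        float termWeightedScore = 0f;
        for (int i = 0; i < weights.length; i++) {
            termWeightedScore += weights[i] * columns[i][doc];
        }
        return bias * subQueryScore + (1 - bias) * termWeightedScore;
    }
}
```

The per-document work becomes one array read and one multiply-add per weight. A document with no value for a field contributes nothing only if its array slot holds 0, which is an assumption about how the arrays are filled.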
