lucene-java-user mailing list archives

From Mike Klaas <mike.kl...@gmail.com>
Subject Re: Lucene Field score value
Date Tue, 31 Jul 2007 21:48:34 GMT
You can boost any clause of a query:

http://lucene.apache.org/java/docs/queryparsersyntax.html

title:foo^5 header:foo^2 body:foo
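A minimal sketch of how the query string above could be assembled in code, assuming the search term is a single word containing no query-syntax characters (nothing is escaped here). The class and method names are hypothetical and not part of Lucene; the resulting string would still have to be handed to a query parser.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not a Lucene class: it only builds the query string.
class BoostedQueryBuilder {

    /**
     * Builds a query string such as "title:foo^5 header:foo^2 body:foo".
     * A boost of 1 is the default, so it is left out of the output.
     */
    static String build(String term, Map<String, Integer> fieldBoosts) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Integer> e : fieldBoosts.entrySet()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(e.getKey()).append(':').append(term);
            if (e.getValue() != 1) {
                sb.append('^').append(e.getValue());
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the fields in insertion order.
        Map<String, Integer> boosts = new LinkedHashMap<>();
        boosts.put("title", 5);
        boosts.put("header", 2);
        boosts.put("body", 1);
        System.out.println(build("foo", boosts));
    }
}
```

Keeping the boosts in a map makes it easy to try different values per field without touching the rest of the search code.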

On 31-Jul-07, at 1:00 PM, Askar Zaidi wrote:

> I'll have to use a StringBuffer and get the Explanation into it as a
> String, then parse it to get the score of each field, then add the
> scores and boost them. That seems to be a non-trivial task. Is there
> any other way around it?
>
> Regarding boosting: can I boost the score of a field? I came across
> boosting of a term in the query, for example:
>
> "apache^4 jakarta"
>
> This means I am more keen to find apache than jakarta. I am keen to
> boost the score of a field instead. How can that be done?
>
> On 7/31/07, Askar Zaidi <askar.zaidi@gmail.com> wrote:
>>
>> Using the Explanation method can help me get the exact score of a
>> field. I am concerned with how I can access it; this is what I am doing:
>>
>>
>>  for (int i = 0; i < hitCount; i++) {
>>      doc = hits.doc(i);
>>      score = hits.score(i);
>>      int id = hits.id(i);  // Lucene's internal document number
>>      System.out.println(id + "\t" + score + "\t" + doc.get("item"));
>>      // explain() describes how this document's score was computed
>>      Explanation exp = searcher.explain(query, id);
>>      System.out.println(exp.toString());
>>  }
>>
>> This shows me the following output (trimmed):
>> 6.6197186E-4 = (MATCH) product of:
>>   0.0033098592 = (MATCH) sum of:
>>     0.0033098592 = (MATCH) weight(contents:innovation in 4), product of:
>>       0.28244132 = queryWeight(contents:innovation), product of:
>>         1.0 = idf(docFreq=5)
>>         0.28244132 = queryNorm
>>       0.01171875 = (MATCH) fieldWeight(contents:innovation in 4), product of:
>>         1.0 = tf(termFreq(contents:innovation)=1)
>>         1.0 = idf(docFreq=5)
>>         0.01171875 = fieldNorm(field=contents, doc=4)
>>   0.2 = coord(1/5)
>>
>> The 6.6197E-4 is the total score of the document. How can I access
>> the fieldWeight? Any example would be great.
>>
>> thanks,
>> AZ
>>
>>
>> On 7/31/07, Erick Erickson <erickerickson@gmail.com> wrote:
>>>
>>> Boost the other three fields at search time. Boosting during
>>> index time expresses "this document's title is worth more than
>>> other documents' titles". Boosting during search time expresses
>>> "I care about matches on this clause more than I do on other
>>> clauses".
>>>
>>> Will it help? How should I know? It's *your* application and
>>> *you* are the one who is dissatisfied with the current scoring <G>.
>>>
>>> All I can say is try it and find out. You might consider using Luke
>>> to try various boosts without having to mess with too much code.
>>>
>>> Erick
>>>
>>> On 7/31/07, Askar Zaidi <askar.zaidi@gmail.com> wrote:
>>>>
>>>> Boosting during Indexing or boosting during search ?
>>>>
>>>> I have 4 fields:
>>>>
>>>> {tags},{title},{summary},{contents}
>>>>
>>>> Typically a phrase occurs too many times in contents as compared to
>>>> the other fields. If I get the score of the contents field, I can
>>>> pass it through an adjuster function which will bring the score down.
>>>> Something like:
>>>>
>>>> public static double adjuster(double count) {
>>>>     double newCount = 1 / Math.exp(count);
>>>>     System.out.println(newCount);
>>>>     return newCount;
>>>> }
>>>>
>>>> Do you mean I could boost the values of the other 3 fields? Will that
>>>> help? Would there be a way to bring down the score of the contents
>>>> field?
>>>>
>>>> thanks,
>>>> AZ
>>>>
>>>>
>>>> On 7/31/07, Erick Erickson <erickerickson@gmail.com> wrote:
>>>>>
>>>>> Wouldn't boosting handle this for you?
>>>>>
>>>>> On 7/31/07, Askar Zaidi <askar.zaidi@gmail.com> wrote:
>>>>>>
>>>>>> To be more specific:
>>>>>>
>>>>>> I want to retrieve the scores of individual fields inside a document
>>>>>> so that I can manipulate the score of one field. This is the
>>>>>> requirement of my application. After the manipulation I can add
>>>>>> these scores and then show the total.
>>>>>>
>>>>>> thanks,
>>>>>>
>>>>>> AZ
>>>>>>
>>>>>> On 7/31/07, Askar Zaidi <askar.zaidi@gmail.com> wrote:
>>>>>>>
>>>>>>> Hey guys,
>>>>>>>
>>>>>>> I was wondering if there is a way to retrieve the score of a field
>>>>>>> in a document?
>>>>>>>
>>>>>>> If my document looks like this:
>>>>>>>
>>>>>>> {itemID},{field 1},{field 2}
>>>>>>>
>>>>>>> I'd like to get the scores of individual fields 1 and 2 rather than
>>>>>>> the score of the entire document.
>>>>>>>
>>>>>>> Is it possible ?
>>>>>>>
>>>>>>> thanks,
>>>>>>> AZ
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>>

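For the string-parsing route described in the quoted thread, here is a rough sketch that pulls the fieldWeight values out of the text printed by `exp.toString()`. It assumes the output keeps the line format shown in the quoted sample, which is not a stable contract; the class and method names are hypothetical. Walking the tree with `Explanation.getDetails()` and `getValue()` is probably more robust than matching text.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not a Lucene class: it works on plain text only.
class FieldWeightParser {

    // Matches lines such as:
    //   0.01171875 = (MATCH) fieldWeight(contents:innovation in 4), product of:
    // Group 1 is the numeric value, group 2 is the field name.
    private static final Pattern FIELD_WEIGHT = Pattern.compile(
            "([0-9.eE+-]+) = (?:\\(MATCH\\) )?fieldWeight\\(([^:]+):");

    /** Sums the fieldWeight values found in the text, keyed by field name. */
    static Map<String, Double> fieldWeights(String explanationText) {
        Map<String, Double> totals = new LinkedHashMap<>();
        Matcher m = FIELD_WEIGHT.matcher(explanationText);
        while (m.find()) {
            double value = Double.parseDouble(m.group(1));
            totals.merge(m.group(2), value, Double::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        // Excerpt of the explanation output quoted in the thread.
        String text =
              "6.6197186E-4 = (MATCH) product of:\n"
            + "  0.0033098592 = (MATCH) sum of:\n"
            + "    0.0033098592 = (MATCH) weight(contents:innovation in 4), product of:\n"
            + "      0.28244132 = queryWeight(contents:innovation), product of:\n"
            + "      0.01171875 = (MATCH) fieldWeight(contents:innovation in 4), product of:\n"
            + "  0.2 = coord(1/5)\n";
        System.out.println(fieldWeights(text));
    }
}
```

Note that fieldWeight is only one factor of a field's contribution. In the quoted output, the clause weight is queryWeight times fieldWeight (0.28244132 x 0.01171875 = 0.0033098592), and the document score multiplies the sum of clause weights by coord (0.2).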


