lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Michael McCandless <luc...@mikemccandless.com>
Subject Re: Boosting results
Date Fri, 07 Nov 2008 09:58:22 GMT

Couldn't you just do a single Query that sorts first by category and  
second by relevance?

Mike

Erick Erickson wrote:

> It seems to me that the easiest thing would be to fire two queries and
> then just concatenate the results
>
> category:A AND body:fred
>
> category:B AND body:fred
>
>
> If you really, really didn't want to fire two queries, you could  
> create
> filters on category A and category B and make a couple of
> passes through your results seeing if the returned documents were in
> the filter, but you'd still concatenate the results. Actually in your
> specific example you could make one filter on A.....
>
> You could also consider a custom scorer that, added 1,000,000 to every
> category A document.
>
> How much were you boosting by? What happens if you boost by a very  
> large
> factor?
> As in ridiculously large?
>
> Best
> Erick
>
> On Thu, Nov 6, 2008 at 7:42 PM, Scott Smith  
> <ssmith@mainstreamdata.com>wrote:
>
>> I'm interested in comments on the following problem.
>>
>>
>>
>> I have a set of documents.  They fall into 3 categories.  Call these
>> categories A, B, and C.  Each document has an indexed, non-tokenized
>> field called "category" which contains A, B, or C (they are mutually
>> exclusive categories).
>>
>>
>>
>> All of the documents contain a field called "body" which contains a
>> bunch of text.  This field is indexed and tokenized.
>>
>>
>>
>> So, I want to do a search which looks something like:
>>
>>
>>
>> (category:A OR category:B) AND body:fred
>>
>>
>>
>> I want all of the category A documents to come before the category B
>> documents.  Effectively, I want to have the category A documents  
>> first
>> (sorted by relevancy) and then the category B documents after  
>> (sorted by
>> relevancy).
>>
>>
>>
>> I thought I could do this by boosting the category portion of the  
>> query,
>> but that doesn't seem to work consistently.  I was setting the  
>> boost on
>> the category A term to 1.0 and the boost on the category B term to  
>> 0.0.
>>
>>
>>
>> Any thoughts how to skin this?
>>
>>
>>
>> Scott
>>
>>


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message