lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Erick Erickson" <erickerick...@gmail.com>
Subject Re: Boosting results
Date Fri, 07 Nov 2008 12:59:04 GMT
duuuuh, sorting. I absolutely love it when I overlook the obvious <G>.

Erick@ThatSlappingSoundYouHearIsMyHandAgainstMyForehead

On Fri, Nov 7, 2008 at 4:58 AM, Michael McCandless <
lucene@mikemccandless.com> wrote:

>
> Couldn't you just do a single Query that sorts first by category and second
> by relevance?
>
> Mike
>
>
> Erick Erickson wrote:
>
>  It seems to me that the easiest thing would be to fire two queries and
>> then just concatenate the results
>>
>> category:A AND body:fred
>>
>> category:B AND body:fred
>>
>>
>> If you really, really didn't want to fire two queries, you could create
>> filters on category A and category B and make a couple of
>> passes through your results seeing if the returned documents were in
>> the filter, but you'd still concatenate the results. Actually in your
>> specific example you could make one filter on A.....
>>
>> You could also consider a custom scorer that, added 1,000,000 to every
>> category A document.
>>
>> How much were you boosting by? What happens if you boost by a very large
>> factor?
>> As in ridiculously large?
>>
>> Best
>> Erick
>>
>> On Thu, Nov 6, 2008 at 7:42 PM, Scott Smith <ssmith@mainstreamdata.com
>> >wrote:
>>
>>  I'm interested in comments on the following problem.
>>>
>>>
>>>
>>> I have a set of documents.  They fall into 3 categories.  Call these
>>> categories A, B, and C.  Each document has an indexed, non-tokenized
>>> field called "category" which contains A, B, or C (they are mutually
>>> exclusive categories).
>>>
>>>
>>>
>>> All of the documents contain a field called "body" which contains a
>>> bunch of text.  This field is indexed and tokenized.
>>>
>>>
>>>
>>> So, I want to do a search which looks something like:
>>>
>>>
>>>
>>> (category:A OR category:B) AND body:fred
>>>
>>>
>>>
>>> I want all of the category A documents to come before the category B
>>> documents.  Effectively, I want to have the category A documents first
>>> (sorted by relevancy) and then the category B documents after (sorted by
>>> relevancy).
>>>
>>>
>>>
>>> I thought I could do this by boosting the category portion of the query,
>>> but that doesn't seem to work consistently.  I was setting the boost on
>>> the category A term to 1.0 and the boost on the category B term to 0.0.
>>>
>>>
>>>
>>> Any thoughts how to skin this?
>>>
>>>
>>>
>>> Scott
>>>
>>>
>>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message