lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Scott Smith" <ssm...@mainstreamdata.com>
Subject RE: Boosting results
Date Fri, 07 Nov 2008 16:17:04 GMT
Well, it's not like sorting hadn't occurred to me.  Unfortunately, what
I recalled was that you could only sort results on one field (I do date
sorted searches all the time in my application).  I should have gone
back and looked.  My memory failed me as I can see that you can sort on
multiple fields and "score" (aka relevancy) is one of the pseudo fields.
That'll work.

Thanks.

Scott

-----Original Message-----
From: Erick Erickson [mailto:erickerickson@gmail.com] 
Sent: Friday, November 07, 2008 5:59 AM
To: java-user@lucene.apache.org
Subject: Re: Boosting results

duuuuh, sorting. I absolutely love it when I overlook the obvious <G>.

Erick@ThatSlappingSoundYouHearIsMyHandAgainstMyForehead

On Fri, Nov 7, 2008 at 4:58 AM, Michael McCandless <
lucene@mikemccandless.com> wrote:

>
> Couldn't you just do a single Query that sorts first by category and
second
> by relevance?
>
> Mike
>
>
> Erick Erickson wrote:
>
>  It seems to me that the easiest thing would be to fire two queries
and
>> then just concatenate the results
>>
>> category:A AND body:fred
>>
>> category:B AND body:fred
>>
>>
>> If you really, really didn't want to fire two queries, you could
create
>> filters on category A and category B and make a couple of
>> passes through your results seeing if the returned documents were in
>> the filter, but you'd still concatenate the results. Actually in your
>> specific example you could make one filter on A.....
>>
>> You could also consider a custom scorer that, added 1,000,000 to
every
>> category A document.
>>
>> How much were you boosting by? What happens if you boost by a very
large
>> factor?
>> As in ridiculously large?
>>
>> Best
>> Erick
>>
>> On Thu, Nov 6, 2008 at 7:42 PM, Scott Smith
<ssmith@mainstreamdata.com
>> >wrote:
>>
>>  I'm interested in comments on the following problem.
>>>
>>>
>>>
>>> I have a set of documents.  They fall into 3 categories.  Call these
>>> categories A, B, and C.  Each document has an indexed, non-tokenized
>>> field called "category" which contains A, B, or C (they are mutually
>>> exclusive categories).
>>>
>>>
>>>
>>> All of the documents contain a field called "body" which contains a
>>> bunch of text.  This field is indexed and tokenized.
>>>
>>>
>>>
>>> So, I want to do a search which looks something like:
>>>
>>>
>>>
>>> (category:A OR category:B) AND body:fred
>>>
>>>
>>>
>>> I want all of the category A documents to come before the category B
>>> documents.  Effectively, I want to have the category A documents
first
>>> (sorted by relevancy) and then the category B documents after
(sorted by
>>> relevancy).
>>>
>>>
>>>
>>> I thought I could do this by boosting the category portion of the
query,
>>> but that doesn't seem to work consistently.  I was setting the boost
on
>>> the category A term to 1.0 and the boost on the category B term to
0.0.
>>>
>>>
>>>
>>> Any thoughts how to skin this?
>>>
>>>
>>>
>>> Scott
>>>
>>>
>>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message