lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Erick Erickson" <erickerick...@gmail.com>
Subject Re: Problem with sorted results.
Date Fri, 26 Dec 2008 19:01:31 GMT
Several points:

1> Prabin's suggestion is, I think, equivalent to something like:
title:(business AND author)^9.0 OR
author:(business AND author)^3.0
OR description:(business AND author)^1.0.
Either way may give you results more in line with what
you expect. (note, the syntax here is suspect, but you get the idea)
I'm not sure what the boosting in your original clause
really does, but it seems suspect... but then I'm no
scoring expert.

2> If you haven't already, get a copy of Luke and
run a few of these scenarios through the "explain" function.
That'll show you a lot about how the queries are scored. And
Query.explain is your friend too.

3> Boosts are a modifier of the score, so expecting the
result sets to be simply predictable will probably surprise you.
Boosting will *tend* to sort things the way you want, but
a document whose description score is 10000 and title score is
1 will still sort above other title scores (numbers pulled out of thin
air). That said, since your title and author fields will very probably
be shorter than description, they'll tend to sort first anyway
I suspect.

Best
Erick

On Fri, Dec 26, 2008 at 12:46 AM, vikas bucha <vikasbucha@gmail.com> wrote:

> Hi,
>
> Merry Christmas and a Happy New Year to you all.
>
> I have an index with few fields. Title, Description, Author etc.
> For a search query "business development", the equivalent lucene query I
> build is:
>
> *(TITLE: business^9.00 OR AUTHOR: business^3.00 OR
> DESCRIPTION:business^1.00) AND (TITLE: development^9.00 OR AUTHOR:
> development^3.00 OR DESCRIPTION:development^1.00)
> *
> The expected result is to get all documents having both *business* and *
> development* in the *TITLE* on top, followed by any one in TITLE and the
> other in AUTHOR(if available) followed by both in AUTHOR and so on..., till
> we get all docs having all the terms appearing anywhere in the document.
> But
> the results are completely different from the expectation.
>
> Please reply if you'd require any other information.
>
> This might be a trivial issue for the pros. I have pulled my hair at it for
> a while.
> Any help with what's wrong here would be highly appreciated.
>
> Thanks
> Vikash.
> --
> If there's no way out, then let's make one.
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message