lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Erick Erickson <erickerick...@gmail.com>
Subject Re: questions regarding index boost vs search boost for multivalued fields
Date Sat, 28 Aug 2010 06:45:15 GMT
Index time and query time boosts have different meanings.
Boosting at index time says "this document's title is more
important than other documents titles". Query time boosts
express "I want matches in the title of a document to count
more than matches in other fields for this query". I guess if
you boosted the titles in all the documents, it'd be about the
same as boosting every title match in every query...

But relying on index-time boosting is not very flexible. To
change the boosts, you have to re-index which can be
very expensive. And you really have no idea what effect
boosts have until you run some trials. Even worse, for
your boosts to have the effect you want, you may have to
change the values as your corpus changes. Which is
very easy to do with query-time boosts, but requires
reindexing with index-time boosts.

In terms of cost, I don't think that there's any noticeable
advantage either way. Query time boosts will probably cause
an extra multiplication, but that's such a minor part of the
total cost of a query that I'd ignore it to start with.

Finally, I think (but don't know for sure) that boosting multiple
values of a multi-valued field uses the last boost set for the
entire field. I'd be surprised if the multi-valued boosts were
kept separately (but I've been surprised before).

Best
Erick

On Fri, Aug 27, 2010 at 7:11 PM, Qi Li <alertli@gmail.com> wrote:

> Here is my index structure.
>  for each document:
>      Field   articleTitle     (only one value)
>      Field   majorHeading    (multiple values)
>      Field   minorHeading     (multiple values)
>
> I use heading (can be both majorHeadings or minorHeadings) to search.  What
> I want is that majorHeading is more important than minorHeading.  I can
> boost the majorHeading during index in two ways
>      method 1:    for every major field of the same document,
> field.setBoost(2f)
>      method 2:    only the first major field of the same document,
> field.setBoost(2f)
>
> Looks like both ways give me the correct result.
> Questions 1 :  What is the difference between boosting only the first field
> or boosting all fields for a multivalued field?
>
> In addition, I can also boost the majorHeading during searching if I choose
> not to boost in index time
> Question 2    What is the trade-off between index boost and search boost?
>
> I will appreciate your help a lot.
>
> Best regards,
> Qi Li
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message