lucene-java-user mailing list archives

From "Uwe Schindler" <...@thetaphi.de>
Subject RE: Index-time boosting: Deprecated setBoost method
Date Fri, 18 Oct 2019 18:31:48 GMT
Hi,

> Is there a working example for this? Is this mentioned in the Lucene
> Javadocs or any other docs so that i can look it?

To index the docvalues, see NumericDocValuesField (it can be added to documents like indexed
or stored fields). You may have used them for sorting already.

> this methodology seems sort of like discouraging using index time boosting.

Not really. Many people use this all the time. It's one of the killer features of both Solr and
Elasticsearch. The problem was how Document.setBoost() worked (it did not work correctly, see below).

> Previous setBoost method call was fine and easy to use.
> Did it have some performance issues and then is that why it was deprecated?

No, it was deprecated for several reasons. setBoost did not do what users expected. Internally
the boost value was just multiplied into the document norm factor (which is internally also a
docvalues field). The norm factors are only very imprecise floats stored in a byte, so precision
is poor. If you put some values into it and the length norm was already consuming all bits, the
boosting was very coarse. The boost was also only ever multiplied in, whereas most users want to
do things like record click counts in the index and then boost with, for example, the logarithm
or some other function. If the boost is just multiplied into the length norm you have no
flexibility at all.
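
To make the precision problem concrete, here is a rough sketch in plain Java. It is not
Lucene's actual norm encoding: the quantize() helper is a hypothetical stand-in that simply
keeps the sign, the exponent and the top 3 mantissa bits of a float, and the 10-term length
norm is an assumed value. It only illustrates how a small boost can vanish once it is
multiplied into a value that is stored with very few bits.

```java
// Simplified illustration only: this is NOT Lucene's real norm encoding.
final class CoarseNormDemo {

    // Hypothetical lossy storage: keep sign, exponent and only the top 3 mantissa bits.
    static float quantize(float value) {
        int bits = Float.floatToRawIntBits(value);
        return Float.intBitsToFloat(bits & 0xFFF00000);
    }

    public static void main(String[] args) {
        // Assumed length norm for a field with 10 terms: 1 / sqrt(10).
        float lengthNorm = (float) (1.0 / Math.sqrt(10));

        // Multiply different boosts into the norm and see what survives the lossy storage.
        for (float boost : new float[] {1.0f, 1.05f, 1.2f}) {
            float stored = quantize(lengthNorm * boost);
            System.out.println("boost=" + boost + " stored=" + stored);
        }
    }
}
```

With this stand-in, a boost of 1.0 and a boost of 1.05 end up as the same stored value, so the
smaller boost is lost entirely, while 1.2 jumps to the next representable step.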

In addition you can have several docvalues fields and use their values in a function (e.g.
one field with click count and another one with product price). You can then combine
click count and price (which can be modified independently during index updates) so that
documents with a lower price and a higher click count are boosted up.
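
As a rough illustration of such a function, the arithmetic could look like the plain-Java
sketch below. The formula, the method name boostedScore and the parameter names are
assumptions made up for this example; in Lucene the equivalent function would be written as an
expression and evaluated against the docvalues fields rather than called like this.

```java
// Hypothetical scoring arithmetic; the formula and all names are illustrative assumptions.
final class DocValueBoostSketch {

    // Combine the score of the inner query with two per-document values.
    static double boostedScore(double queryScore, long clickCount, double price) {
        // The logarithm dampens the click count so very popular documents do not dominate.
        double clickFactor = Math.log(2.0 + clickCount);
        // A lower price gives a larger factor.
        double priceFactor = 1.0 / (1.0 + price);
        return queryScore * clickFactor * priceFactor;
    }

    public static void main(String[] args) {
        // Same inner-query score, different click counts and prices.
        System.out.println(boostedScore(1.0, 10, 10.0));
        System.out.println(boostedScore(1.0, 100, 10.0));
        System.out.println(boostedScore(1.0, 10, 5.0));
    }
}
```

Because click count and price live in separate fields, either one can be updated on its own
and the function can be changed at query time without reindexing.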

This is what you can do with the expressions module. You just give it a function.

Here is an example; the second example on that page uses a FunctionScoreQuery that modifies the
score based on the function and the given docvalues:
https://lucene.apache.org/core/7_7_2/expressions/org/apache/lucene/expressions/Expression.html

> FunctionScoreQuery usage with MultiFieldQueryParser would also be nice where
> MultiFieldQuery already has boosts field to do this in its constructor.

The boosts in the query parser are applied to fields at query time (to give each field a
different weight). Index-time boosting is per document. So you can combine both.

> Maybe it is not needed with MultiFieldQueryParser.

You use MultiFieldQueryParser to adjust weights of the fields (e.g. title versus body). The
parsed query is then wrapped with an expression that modifies the score per document according
to the docvalues.
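
The two levels can be pictured with the small plain-Java sketch below. It is only a model of
the arithmetic, under the assumption that the weighted per-field scores are simply added up
and that the per-document factor is a logarithm of a click count; all names and weights are
made up for illustration and none of this is Lucene API.

```java
// Model of the arithmetic only; names, weights and formulas are illustrative assumptions.
final class CombinedBoostSketch {

    // Query-time part: each field's score is scaled by its weight (e.g. title versus body).
    static double fieldWeightedScore(double titleScore, double bodyScore,
                                     double titleWeight, double bodyWeight) {
        return titleWeight * titleScore + bodyWeight * bodyScore;
    }

    // Per-document part: the score of the parsed query is modified by a value stored per document.
    static double finalScore(double parsedQueryScore, long clickCount) {
        return parsedQueryScore * Math.log(2.0 + clickCount);
    }

    public static void main(String[] args) {
        // The title counts twice as much as the body for every document.
        double parsed = fieldWeightedScore(1.0, 1.0, 2.0, 1.0);
        // The click count then shifts individual documents up or down.
        System.out.println(finalScore(parsed, 0));
        System.out.println(finalScore(parsed, 50));
    }
}
```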

Uwe

> On 10/18/19 1:28 PM, Uwe Schindler wrote:
> 
> > Hi,
> >
> > that's not true. You can do index time boosting, but you need to do that
> > using a separate field. You just index a numeric docvalues field (which may
> > contain a long or float value per document). Later you wrap your query with
> > some FunctionScoreQuery (e.g., use the Javascript function query syntax in
> > the expressions module). This allows you to compile a javascript function
> > that calculates the final score based on the score returned by the inner query
> > and combines it with docvalues that were indexed per document.
> >
> > Uwe
> >
> > -----
> > Uwe Schindler
> > Achterdiek 19, D-28357 Bremen
> > https://www.thetaphi.de
> > eMail: uwe@thetaphi.de
> >
> >> -----Original Message-----
> >> From: baris.kazar@oracle.com <baris.kazar@oracle.com>
> >> Sent: Friday, October 18, 2019 5:28 PM
> >> To: java-user@lucene.apache.org
> >> Cc: baris.kazar@oracle.com
> >> Subject: Re: Index-time boosting: Deprecated setBoost method
> >>
> >> It looks like index-time boosting (field) is not possible since Lucene
> >> version 7.7.2 and
> >>
> >> i was using before for another case the BoostQuery at search time for
> >> boosting and
> >>
> >> this seems to be the only boosting option now in Lucene.
> >>
> >> Best regards
> >>
> >>
> >> On 10/18/19 10:01 AM, baris.kazar@oracle.com wrote:
> >>> Hi,-
> >>>
> >>> i saw this in the Field class docs and i am figuring out the following
> >>> note in the docs:
> >>>
> >>> setBoost(float boost)
> >>> Deprecated.
> >>> Index-time boosts are deprecated, please index index-time scoring
> >>> factors into a doc value field and combine them with the score at
> >>> query time using eg. FunctionScoreQuery.
> >>>
> >>> I appreciate this note. Is there an example about this? I wish docs
> >>> would give a simple example to further help.
> >>>
> >>>
> >>> https://lucene.apache.org/core/6_6_0//core/org/apache/lucene/document/Field.html
> >>>
> >>> vs
> >>>
> >>>
> >>> https://lucene.apache.org/core/7_7_2/core/org/apache/lucene/document/Field.html
> >>>
> >>> Best regards
> >>>
> >>>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >> For additional commands, e-mail: java-user-help@lucene.apache.org