lucene-java-user mailing list archives

From Vadim Gindin <vgin...@detectum.com>
Subject Re: COST vs SCORE vs WEIGHT
Date Thu, 30 Nov 2017 16:58:45 GMT
Thanks Adrien!

1) Here is my code snippet:

Query params_vendor = new ConstTermQuery(new Term("params_vendor", queryStr), 5f);
Query params_model = new ConstTermQuery(new Term("params_model", queryStr), 5f);
Query params_value = new ConstTermQuery(new Term("params_value", queryStr), 3f);
Query param_name = new ConstTermQuery(new Term("params_name", queryStr), 4f);

BooleanQuery bq = expected
        .add(params_vendor, BooleanClause.Occur.SHOULD)
        .add(params_model, BooleanClause.Occur.SHOULD)
        .add(params_value, BooleanClause.Occur.SHOULD)
        .add(param_name, BooleanClause.Occur.SHOULD)
        .setMinimumNumberShouldMatch(1)
        .build();


ConstTermQuery is my custom Query; it creates its own Weight and then its
own Scorer. The scorer simply returns the score passed to the constructor
(4 for "params_name"). The test index does not contain the fields
"params_name" and "params_value", yet the returned document score is 17 for
all records. Why?

2) A Scorer can iterate over matches, can't it?

I set up the iterator in the scorer's constructor as follows:

this.iterator = DocIdSetIterator.all(context.reader().maxDoc());

And then I return it from iterator():

public DocIdSetIterator iterator() {
    return iterator;
}

Is this a correct implementation? Are there other ways to implement it?

Thanks a lot for your response.
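
The arithmetic behind the score of 17 can be sketched without Lucene. The class below is a simplified model, not Lucene code: `Clause`, `matchAllClauses` and `termRestrictedClauses` are hypothetical names, an `IntPredicate` stands in for a scorer's iterator, and the doc IDs in the second setup are made up for illustration. It only shows that if every clause iterates over all documents (which is what `DocIdSetIterator.all(maxDoc)` does), a SHOULD sum over matching clauses is 5 + 5 + 3 + 4 = 17 for every document, while clauses restricted to documents that actually contain the term contribute only where they match.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.IntPredicate;

public class ShouldClauseSum {

    /** One SHOULD clause: a constant score plus a stand-in for the scorer's iterator. */
    static final class Clause {
        final String field;
        final float score;
        final IntPredicate matches; // true if the clause "iterates over" this doc ID

        Clause(String field, float score, IntPredicate matches) {
            this.field = field;
            this.score = score;
            this.matches = matches;
        }
    }

    /** Sums the scores of the clauses that match docId, as a disjunction of SHOULD clauses would. */
    static float score(List<Clause> clauses, int docId) {
        float sum = 0f;
        for (Clause c : clauses) {
            if (c.matches.test(docId)) {
                sum += c.score;
            }
        }
        return sum;
    }

    /** Every clause matches every document, mimicking an iterator over all doc IDs. */
    static List<Clause> matchAllClauses() {
        IntPredicate all = doc -> true;
        return Arrays.asList(
                new Clause("params_vendor", 5f, all),
                new Clause("params_model", 5f, all),
                new Clause("params_value", 3f, all),
                new Clause("params_name", 4f, all));
    }

    /** Clauses match only docs that contain the term (doc IDs are illustrative assumptions). */
    static List<Clause> termRestrictedClauses() {
        IntPredicate none = doc -> false; // field absent from the index
        return Arrays.asList(
                new Clause("params_vendor", 5f, doc -> doc == 0),
                new Clause("params_model", 5f, doc -> doc == 0 || doc == 1),
                new Clause("params_value", 3f, none),
                new Clause("params_name", 4f, none));
    }

    public static void main(String[] args) {
        System.out.println(score(matchAllClauses(), 0));       // 17.0
        System.out.println(score(termRestrictedClauses(), 0)); // 10.0
        System.out.println(score(termRestrictedClauses(), 1)); // 5.0
    }
}
```

In this model the fix is on the iterator side: the sum itself is already restricted to matching clauses, so each clause has to report only the documents that contain its term.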

Regards,
Vadim Gindin

On Thu, Nov 30, 2017 at 8:56 PM, Adrien Grand <jpountz@gmail.com> wrote:

> Hi Vadim,
>
> A Weight is the specialization of a query for a given index reader. It has
> access to index statistics that will help compute scores for instance.
>
> A Scorer is the specialization of a weight for a given segment. It can
> iterate over matches and compute scores.
>
> The cost of a scorer is the expected number of matching documents for this
> scorer. It is useful in order to run operations in the optimal order.
>
> Your observation of the behaviour of your BooleanQuery with SHOULD clauses
> looks wrong: the score of the boolean query is the sum of the scores of the
> matching sub queries.
>
> On Thu, Nov 30, 2017 at 16:39, Vadim Gindin <vgindin@detectum.com> wrote:
>
> > Hi
> >
> > 1) What is the principal difference between COST, SCORE and WEIGHT?
> >
> > 2) Assume we have a BooleanQuery with 5 TermQuery subqueries that are
> > included via SHOULD clauses. Assume we have 5 fields and each subquery
> > searches in one field, much like the output of MultiFieldQueryParser. In
> > this case the score of the BooleanQuery is the sum of the scores of the
> > subqueries. I expected that only the subqueries that found something
> > would be included, but in fact the sum covers all subqueries. Why? How
> > can I implement the logic I need, a sum over only those subqueries that
> > found something? How can I check that?
> >
> > Regards,
> > Vadim Gindin
> >
>
