lucene-java-user mailing list archives

From Adrien Grand <jpou...@gmail.com>
Subject Re: COST vs SCORE vs WEIGHT
Date Fri, 01 Dec 2017 08:09:22 GMT
Your implementation is going to match all documents, since its iterator
matches every document in the segment. I believe you could do what you
want with built-in queries by doing

Query params_vendor = new BoostQuery(
    new ConstantScoreQuery(new TermQuery(new Term("params_vendor", queryStr))),
    5f);

and similarly for other queries.
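
To illustrate why every document ends up with a score of 17, here is a
minimal sketch in plain Java. It does not use Lucene at all: the class
ShouldClauseModel, its Clause type, and the document IDs are hypothetical
names made up for this illustration. It only models the rule that a
BooleanQuery made of SHOULD clauses sums the scores of the clauses whose
iterator returns the document.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Lucene-free model of SHOULD-clause scoring (all names are hypothetical).
class ShouldClauseModel {

    /** One SHOULD clause: a constant score and the doc IDs its iterator returns. */
    static final class Clause {
        final float score;
        final Set<Integer> docs;

        Clause(float score, Set<Integer> docs) {
            this.score = score;
            this.docs = docs;
        }
    }

    /** Models an iterator over every doc ID in [0, maxDoc), like DocIdSetIterator.all(maxDoc). */
    static Set<Integer> allDocs(int maxDoc) {
        Set<Integer> docs = new HashSet<>();
        for (int d = 0; d < maxDoc; d++) {
            docs.add(d);
        }
        return docs;
    }

    /** Models an iterator restricted to the docs that actually contain the term. */
    static Set<Integer> docs(int... ids) {
        Set<Integer> docs = new HashSet<>();
        for (int id : ids) {
            docs.add(id);
        }
        return docs;
    }

    /** Sum of the scores of the clauses whose iterator contains the doc. */
    static float score(List<Clause> clauses, int doc) {
        float sum = 0f;
        for (Clause c : clauses) {
            if (c.docs.contains(doc)) {
                sum += c.score;
            }
        }
        return sum;
    }

    /** Four clauses (5, 5, 3, 4) that each claim to match every document. */
    static List<Clause> matchAllClauses(int maxDoc) {
        List<Clause> clauses = new ArrayList<>();
        for (float s : new float[] {5f, 5f, 3f, 4f}) {
            clauses.add(new Clause(s, allDocs(maxDoc)));
        }
        return clauses;
    }

    /** The same four clauses, each restricted to made-up matching docs. */
    static List<Clause> termClauses() {
        List<Clause> clauses = new ArrayList<>();
        clauses.add(new Clause(5f, docs(0, 1))); // params_vendor
        clauses.add(new Clause(5f, docs(1)));    // params_model
        clauses.add(new Clause(3f, docs()));     // params_value: no matches
        clauses.add(new Clause(4f, docs()));     // params_name: no matches
        return clauses;
    }

    public static void main(String[] args) {
        System.out.println(score(matchAllClauses(3), 0));
        System.out.println(score(termClauses(), 0));
        System.out.println(score(termClauses(), 1));
    }
}
```

With the match-all iterators every document scores 5 + 5 + 3 + 4 = 17. With
the restricted iterators, document 0 scores 5 and document 1 scores 10.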

On Thu, Nov 30, 2017 at 17:58, Vadim Gindin <vgindin@detectum.com> wrote:

> Thanks Adrien!
>
> 1) Here is my code snippet:
>
> Query params_vendor = new ConstTermQuery(new Term("params_vendor",
> queryStr), 5f);
> Query params_model = new ConstTermQuery(new Term("params_model",
> queryStr), 5f);
> Query params_value = new ConstTermQuery(new Term("params_value",
> queryStr), 3f);
> Query param_name = new ConstTermQuery(new Term("params_name", queryStr),
> 4f);
>
> BooleanQuery bq = new BooleanQuery.Builder()
>         .add(params_vendor, BooleanClause.Occur.SHOULD)
>         .add(params_model, BooleanClause.Occur.SHOULD)
>         .add(params_value, BooleanClause.Occur.SHOULD)
>         .add(param_name, BooleanClause.Occur.SHOULD)
>         .setMinimumNumberShouldMatch(1)
>         .build();
>
>
> ConstTermQuery here is my custom Query that creates its own Weight and
> then its own Scorer. The created scorer just returns the score specified
> in the constructor (4 for "params_name"). The test index does not contain
> the fields "param_name" and "param_value", but the returned Doc.score is
> 17 for all records. Why?
>
> 2) A Scorer can iterate over matches, can't it?
>
> I create the iterator in the scorer constructor as follows:
>
> this.iterator = DocIdSetIterator.all(context.reader().maxDoc());
>
> And then
>
> public DocIdSetIterator iterator() {
>     return iterator;
> }
>
> Is that a correct implementation? Are there other ways to implement it?
>
> Thanks a lot for your response
>
> Regards,
> Vadim Gindin
>
> On Thu, Nov 30, 2017 at 8:56 PM, Adrien Grand <jpountz@gmail.com> wrote:
>
> > Hi Vadim,
> >
> > A Weight is the specialization of a query for a given index reader. It
> > has access to index statistics that will help compute scores, for
> > instance.
> >
> > A Scorer is the specialization of a weight for a given segment. It can
> > iterate over matches and compute scores.
> >
> > The cost of a scorer is the expected number of matching documents for
> > this scorer. It is useful in order to run operations in the optimal
> > order.
> >
> > Your observation of the behaviour of your BooleanQuery with SHOULD
> > clauses looks wrong: the score of the boolean query is the sum of the
> > scores of the matching sub queries.
> >
> > On Thu, Nov 30, 2017 at 16:39, Vadim Gindin <vgindin@detectum.com> wrote:
> >
> > > Hi
> > >
> > > 1) What is the main difference between COST, SCORE, and WEIGHT?
> > >
> > > 2) Assume we have a BooleanQuery with 5 TermQuery subqueries that are
> > > added as SHOULD clauses. Assume we have 5 fields and each subquery
> > > searches one field, similar to what MultiFieldQueryParser produces.
> > > In this case the score of the BooleanQuery is the sum of the scores
> > > of the subqueries. I expected that only the subqueries that found
> > > something would be included, but in fact I get the sum over all
> > > subqueries. Why? How can I implement the logic I need: the sum over
> > > only those subqueries that found something? How can I check that?
> > >
> > > Regards,
> > > Vadim Gindin
> > >
> >
>
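
As a rough illustration of the cost definition quoted earlier in the thread,
the sketch below shows one way an expected match count can be used to order
work. It is a simplified, Lucene-free model: the class CostOrderingModel and
the posting arrays are hypothetical, "cost" is approximated by the array
length, and membership is checked with a binary search rather than with
Lucene's iterator API. The idea shown is only that an intersection is
cheapest when it is driven from the clause expected to match the fewest
documents.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Lucene-free model of cost-based ordering (all names are hypothetical).
class CostOrderingModel {

    /**
     * Intersects sorted doc-ID arrays. The array with the lowest cost
     * (here: the fewest entries) leads, so the outer loop runs as few
     * times as possible; the other arrays are only probed for membership.
     */
    static List<Integer> intersect(List<int[]> postings) {
        List<Integer> result = new ArrayList<>();
        if (postings.isEmpty()) {
            return result;
        }

        // Order clauses by cost, cheapest first.
        List<int[]> byCost = new ArrayList<>(postings);
        byCost.sort(Comparator.comparingInt((int[] p) -> p.length));

        int[] lead = byCost.get(0);
        for (int doc : lead) {
            boolean inAll = true;
            for (int i = 1; i < byCost.size(); i++) {
                // Arrays must be sorted ascending for binarySearch to work.
                if (Arrays.binarySearch(byCost.get(i), doc) < 0) {
                    inAll = false;
                    break;
                }
            }
            if (inAll) {
                result.add(doc);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<int[]> postings = Arrays.asList(
                new int[] {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}, // high cost
                new int[] {1, 5, 9},                      // low cost, leads
                new int[] {5, 9, 12});
        System.out.println(intersect(postings));
    }
}
```

Here the outer loop visits three candidates instead of ten, because the
ten-entry array is never used as the lead.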
