lucene-java-user mailing list archives

From "Uwe Schindler" <...@thetaphi.de>
Subject RE: TermInSetQuery keep terms order in results
Date Mon, 25 Jun 2018 19:49:52 GMT
Hi Nicola,

if you sort the results elsewhere, why do you care about the order Lucene returns them in? What
you see is simple: a constant-score query has nothing available for scoring, so it returns the
results in index order. That is intended. There is no way to change this "default" order for a
TermInSetQuery, because the query has no information to order the results by.
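
For readers who need the hits in the order of the submitted terms: since the query returns hits
in index order, one option is to re-sort them in application code after the search. The following
is a minimal sketch in plain Java, not Lucene API. The class TermOrderSorter, the method
sortByTermOrder, and the assumption that each hit exposes its matched term value as a string
(for example from a stored ID field) are hypothetical and not from this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TermOrderSorter {

    /**
     * Returns a copy of hitIds reordered to follow the order of requestedTerms.
     * Ids that do not appear in requestedTerms are placed at the end and keep
     * their original relative order (the index order they arrived in).
     */
    public static List<String> sortByTermOrder(List<String> requestedTerms, List<String> hitIds) {
        // Map each term to its position in the requested list; first occurrence wins.
        Map<String, Integer> rank = new HashMap<>();
        for (int i = 0; i < requestedTerms.size(); i++) {
            rank.putIfAbsent(requestedTerms.get(i), i);
        }
        List<String> sorted = new ArrayList<>(hitIds);
        // List.sort is stable, so hits with equal rank stay in index order.
        sorted.sort(Comparator.comparingInt(id -> rank.getOrDefault(id, Integer.MAX_VALUE)));
        return sorted;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("c", "a", "b"); // order supplied by the caller
        List<String> hits = Arrays.asList("a", "b", "c");  // order returned by the index
        System.out.println(sortByTermOrder(terms, hits));  // prints [c, a, b]
    }
}
```

The lookup map keeps the sort at O(n log n) even with 100,000 terms, whereas calling
indexOf on the term list for every comparison would scan the list each time.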

Uwe

-----
Uwe Schindler
Achterdiek 19, D-28357 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de

> -----Original Message-----
> From: Nicola Buso <nbuso@ebi.ac.uk>
> Sent: Monday, June 25, 2018 5:09 PM
> To: Uwe Schindler <uwe@thetaphi.de>; java-user@lucene.apache.org
> Subject: Re: TermInSetQuery keep terms order in results
> 
> Hi Uwe,
> 
> thanks for the reply. TermInSetQuery covers most of my use case:
> - thousands of term values (even 100,000)
> - no need for scoring, because it's calculated elsewhere
> - intersect with a normal full-text query for further filtering
> 
> Using TermQuery, do I risk hitting the BooleanQuery.getMaxClauseCount()
> limit?
> 
> Cheers,
> 
> 
> Nicola
> 
> 
> 
> On Mon, 2018-06-25 at 16:52 +0200, Uwe Schindler wrote:
> > Hi,
> >
> > the TermInSetQuery is a so-called Constant Score Query. It is meant
> > more as a filter, so you would need some "real" fulltext query in
> > parallel. Think of the term-in-set query as the SQL "IN" operator.
> > It can be used to pass lots of identifiers to filter results (e.g.
> > when you apply per-user access rights or group policies to your
> > main query as a filter).
> >
> > As it is a "set", which is by default unordered, the order of terms
> > in the set is undefined. Internally TermInSetQuery reorders the terms
> > to improve processing speed.
> >
> > If you need scoring, use TermQuery wrapped by a BooleanQuery. Then
> > you can apply boosts to some terms to influence the order (e.g. give
> > higher boosts to the term queries that come first) and run it on a
> > field without norms.
> >
> > TermInSetQuery is fast because it neglects scoring and is just good
> > at intersecting the terms dict with the given terms set.
> >
> > Uwe
> >
> > -----
> > Uwe Schindler
> > Achterdiek 19, D-28357 Bremen
> > http://www.thetaphi.de
> > eMail: uwe@thetaphi.de
> >
> > > -----Original Message-----
> > > From: Nicola Buso <nbuso@ebi.ac.uk>
> > > Sent: Monday, June 25, 2018 1:23 PM
> > > To: java-user@lucene.apache.org
> > > Subject: TermInSetQuery keep terms order in results
> > >
> > > Hi,
> > >
> > > I need to use the TermInSetQuery, but I would like to keep the
> > > results sorted in the order of the term set provided. Currently it
> > > seems to return results in index document insertion order.
> > >
> > > Is this already implemented somewhere or do I need to implement a
> > > CustomScoreQuery to calculate this score?
> > >
> > > Cheers,
> > >
> > >
> > > Nicola
> > >
> > >
> > > --
> > > Nicola Buso <nbuso@ebi.ac.uk>
> > > EMBL-EBI
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > > For additional commands, e-mail: java-user-help@lucene.apache.org
> >
> >
> 

