From: "Uwe Schindler"
Subject: RE: TermInSetQuery keep terms order in results
Date: Mon, 25 Jun 2018 21:49:52 +0200

Hi Nicola,

if you sort it elsewhere, why do you care about the sort order then? What you see as a result is simple: as there is nothing available for scoring, a constant score query returns the results in index order. That is intended. There is no way to change this "default" order for a TermInSetQuery because the information needed for it is missing.

Uwe

-----
Uwe Schindler
Achterdiek 19, D-28357 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de

> -----Original Message-----
> From: Nicola Buso
> Sent: Monday, June 25, 2018 5:09 PM
> To: Uwe Schindler; java-user@lucene.apache.org
> Subject: Re: TermInSetQuery keep terms order in results
>
> Hi Uwe,
>
> thanks for the reply. TermInSetQuery covers most of my use case:
> - thousands of term values (even 100,000)
> - no need for scoring, because it's calculated elsewhere
> - intersect with normal full text query for further filtering
>
> If I use TermQuery, do I risk hitting the BooleanQuery.getMaxClauseCount()
> limit?
>
> Cheers,
>
> Nicola
>
> On Mon, 2018-06-25 at 16:52 +0200, Uwe Schindler wrote:
> > Hi,
> >
> > the TermInSetQuery is a so-called Constant Score Query.
> > It is meant more as a filter, so you would need some "real" full-text query in
> > parallel. Think of the term-in-set query as the SQL "IN" operator.
> > It can be used to pass lots of identifiers to filter results (e.g.
> > when you apply users' access rights or group policies to
> > your main query as a filter).
> >
> > As it is a "set", which is unordered by default, the order of terms
> > in the set is undefined. Internally, TermInSetQuery reorders the terms
> > to improve processing speed.
> >
> > If you need scoring, use TermQuery clauses wrapped in a BooleanQuery. Then
> > you can apply boosts to some terms to influence the order (e.g. boost
> > the term queries that come first) and run it on a field without norms.
> >
> > TermInSetQuery is fast because it skips scoring and is just good
> > at intersecting the terms dict with the given term set.
> >
> > Uwe
> >
> > -----
> > Uwe Schindler
> > Achterdiek 19, D-28357 Bremen
> > http://www.thetaphi.de
> > eMail: uwe@thetaphi.de
> >
> > > -----Original Message-----
> > > From: Nicola Buso
> > > Sent: Monday, June 25, 2018 1:23 PM
> > > To: java-user@lucene.apache.org
> > > Subject: TermInSetQuery keep terms order in results
> > >
> > > Hi,
> > >
> > > I need to use the TermInSetQuery, but I would like to keep the
> > > results sorted in the order of the term set provided. Currently
> > > it seems to return results in document insertion (index) order.
> > >
> > > Is this already implemented somewhere, or do I need to implement a
> > > CustomScoreQuery to calculate this score?
> > >
> > > Cheers,
> > >
> > > Nicola
> > >
> > > --
> > > Nicola Buso
> > > EMBL-EBI
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > > For additional commands, e-mail: java-user-help@lucene.apache.org
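
Since TermInSetQuery returns hits in index order and the ordering is decided elsewhere anyway, one option is to reorder the hits on the client side after the search. The following is only a minimal sketch in plain Java, not Lucene API: the class TermOrderSorter, the method reorder and the sample identifiers are hypothetical names made up for illustration. It assumes each hit carries exactly one identifier, taken from the same field the term set was built on.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Lucene: restores the caller's term order
// on a list of hit identifiers that came back in index order.
class TermOrderSorter {

    static List<String> reorder(List<String> requestedTerms, List<String> hitsInIndexOrder) {
        // Remember the position of each term in the order the caller supplied.
        Map<String, Integer> rank = new HashMap<>();
        for (int i = 0; i < requestedTerms.size(); i++) {
            rank.putIfAbsent(requestedTerms.get(i), i); // first occurrence wins
        }
        List<String> sorted = new ArrayList<>(hitsInIndexOrder);
        // List.sort is stable, so hits with an unknown identifier go last
        // and keep their index order among themselves.
        sorted.sort(Comparator.comparingInt(
                (String id) -> rank.getOrDefault(id, Integer.MAX_VALUE)));
        return sorted;
    }

    public static void main(String[] args) {
        List<String> requested = Arrays.asList("P3", "P1", "P2"); // order wanted by the caller
        List<String> hits = Arrays.asList("P1", "P2", "P3");      // order returned by the index
        System.out.println(reorder(requested, hits));             // prints [P3, P1, P2]
    }
}
```

The rank map is built once, so even with 100,000 terms the cost is one pass over the term list plus a sort of the hits, and no BooleanQuery clause limit is involved.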