From: Vadim Gindin
Date: Thu, 14 Dec 2017 14:45:10 +0500
Subject: Re: Tracking that all query terms are matched in one document
To: java-user@lucene.apache.org

Thank you

On Wed, Dec 13, 2017 at 3:32 PM, Mikhail Khludnev wrote:

> There are two algorithms for scoring a disjunction: term-at-a-time and
> doc-at-a-time. The former was called BooleanScorer and the latter was
> called BooleanScorer2. I remember that they were drastically renamed
> and/or replaced with BulkScorer or so. Anyway, you need to find a way to
> prevent term-at-a-time scoring, which is when FakeScorer is injected.
> You need to make it score doc-at-a-time. As I told you, it's a long way.
>
> On Wed, Dec 13, 2017 at 11:55 AM, Vadim Gindin wrote:
>
> > Hi Michael,
> >
> > I've tried to implement this case but ran into the following problem.
> > To recall, my Query combines several ConstantScoreQuery clauses in a
> > BooleanQuery. I wrote a custom Collector as follows:
> >
> >     @Override
> >     public void setScorer(Scorer scorer) throws IOException {
> >         this.scorer = scorer;
> >     }
> >
> >     @Override
> >     public void collect(int doc) throws IOException {
> >         System.out.println("doc=" + doc);
> >         diveIntoScorers(this.scorer);
> >     }
> >
> > When I dive recursively into the child scorers, I get an
> > UnsupportedOperationException. It happens because of the following
> > code in BooleanScorer:
> >
> >     @Override
> >     public int score(LeafCollector collector, Bits acceptDocs, int min,
> >             int max) throws IOException {
> >         fakeScorer.doc = -1;
> >         collector.setScorer(fakeScorer);
> >
> > Later fakeScorer throws an exception.
> >
> > How did you implement your similar functionality?
> > How can I avoid this?
> >
> > Thanks,
> > Vadim Gindin
> >
> > On Fri, Dec 8, 2017 at 2:01 PM, Vadim Gindin wrote:
> >
> > > Thanks for your help. I'll try that.
> > >
> > > On Tue, Dec 5, 2017 at 4:18 PM, Mikhail Khludnev wrote:
> > >
> > > > Vadim,
> > > > You can create a collector which checks Scorer.getChildren()
> > > > https://issues.apache.org/jira/browse/LUCENE-7628 but it's quite
> > > > cumbersome. I'd suggest avoiding this if possible. However, Elastic
> > > > does something like this with named queries or so.
> > > > I talked about this a few years ago:
> > > > https://www.youtube.com/watch?v=sGVyUdNGBgw
> > > >
> > > > On Tue, Dec 5, 2017 at 12:36 PM, Vadim Gindin wrote:
> > > >
> > > > > I'm not sure that I will be able to track somehow that different
> > > > > terms were matched to the same document...
> > > > >
> > > > > I'm thinking of a slightly different way: when a query scores some
> > > > > document, save the query term for that document somewhere. Probably
> > > > > it would be some map in some class SearchContext. I could write
> > > > > something like this:
> > > > >
> > > > >     // Does such a search context exist in Lucene? Maybe QueryContext.
> > > > >     SearchContext sc = getSearchContext();
> > > > >     // docTerms is a Map where the key is a document ID and the
> > > > >     // value is a list of terms by which this document was matched.
> > > > >     sc.getDocTerms().get(docID).add(query.getTerm());
> > > > >
> > > > > I need to save somewhere the document ID and the term that matched
> > > > > that document. Could somebody advise me of an appropriate place?
> > > > >
> > > > > Regards,
> > > > > Vadim Gindin
> > > > >
> > > > > On Tue, Dec 5, 2017 at 12:04 PM, Vadim Gindin wrote:
> > > > >
> > > > > > For example like this:
> > > > > >
> > > > > >     BooleanQuery.Builder expected = new BooleanQuery.Builder();
> > > > > >
> > > > > >     Query param_vendor = new BoostQuery(new ConstantScoreQuery(
> > > > > >             new TermQuery(new Term("param_vendor", queryStr))), 5f);
> > > > > >     Query param_model = new BoostQuery(new ConstantScoreQuery(
> > > > > >             new TermQuery(new Term("param_model", queryStr))), 5f);
> > > > > >     Query param_value = new BoostQuery(new ConstantScoreQuery(
> > > > > >             new TermQuery(new Term("param_value", queryStr))), 3f);
> > > > > >     Query param_name = new BoostQuery(new ConstantScoreQuery(
> > > > > >             new TermQuery(new Term("param_name", queryStr))), 4f);
> > > > > >
> > > > > >     BooleanQuery bq = expected
> > > > > >             .add(param_vendor, BooleanClause.Occur.SHOULD)
> > > > > >             .add(param_model, BooleanClause.Occur.SHOULD)
> > > > > >             .add(param_value, BooleanClause.Occur.SHOULD)
> > > > > >             .add(param_name, BooleanClause.Occur.SHOULD)
> > > > > >             .setMinimumNumberShouldMatch(1)
> > > > > >             .build();
> > > > > >
> > > > > >     return new BoostQuery(bq, queryBoost);
> > > > > >
> > > > > > Vadim
> > > > > >
> > > > > > On Tue, Dec 5, 2017 at 9:24 AM, Michael Sokolov <msokolov@gmail.com> wrote:
> > > > > >
> > > > > > > Well, how did you make the original query?
> > > > > > >
> > > > > > > On Dec 4, 2017 12:05 PM, "Vadim Gindin" wrote:
> > > > > > >
> > > > > > > > Yes, thanks. My question is exactly about how to create
> > > > > > > > "another extra query that requires all the terms in the
> > > > > > > > original query".
> > > > > > > >
> > > > > > > > On Mon, Dec 4, 2017 at 6:50 PM, Michael Sokolov <msokolov@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > I'm just saying that when you form your query, you could
> > > > > > > > > also create another extra query that requires all the
> > > > > > > > > terms in the original query, and then combine it with the
> > > > > > > > > original query in a boolean where the original query is
> > > > > > > > > required and the extra query is optional. That will give
> > > > > > > > > a boost when all the terms are found, although I think
> > > > > > > > > the scores will be added, not multiplied.
> > > > > > > > >
> > > > > > > > > On Dec 4, 2017 5:22 AM, "Vadim Gindin" wrote:
> > > > > > > > >
> > > > > > > > > > Thanks, Michael!
> > > > > > > > > >
> > > > > > > > > > Yes, I'm sure. Could you explain your proposal in more
> > > > > > > > > > detail?
> > > > > > > > > >
> > > > > > > > > > Regards,
> > > > > > > > > > Vadim Gindin
> > > > > > > > > >
> > > > > > > > > > On Mon, Dec 4, 2017 at 3:18 PM, Michael Sokolov <msokolov@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > > You could combine a Boolean AND query with the same
> > > > > > > > > > > terms, as an optional clause. Are you sure about the
> > > > > > > > > > > requirement to multiply the score in that case?
> > > > > > > > > > >
> > > > > > > > > > > On Dec 4, 2017 5:13 AM, "Vadim Gindin" <vgindin@detectum.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi all.
> > > > > > > > > > > >
> > > > > > > > > > > > I need to track that all query terms are matched
> > > > > > > > > > > > in one document. When all terms are matched, I
> > > > > > > > > > > > need to multiply the score of that document by
> > > > > > > > > > > > some constant coefficient.
> > > >
> > > > --
> > > > Sincerely yours
> > > > Mikhail Khludnev
>
> --
> Sincerely yours
> Mikhail Khludnev
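
A note on the term-at-a-time versus doc-at-a-time distinction Mikhail describes above. The toy model below is a minimal sketch in plain Java, not Lucene code and not how BooleanScorer is actually implemented. The class AllTermsBoostSketch, its Clause type, the sample postings, and the factor 2 are all hypothetical, chosen only for illustration; the field names and constant scores echo the query quoted in the thread. It is meant to show why a per-document loop can tell that every clause matched (and so can multiply the score), while a per-term loop only accumulates sums and never sees all clauses for one document at once.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: this is NOT Lucene's BooleanScorer, just an illustration.
class AllTermsBoostSketch {

    // Hypothetical clause: a field label, a constant score, and the doc IDs it matches.
    static final class Clause {
        final String field;
        final float score;
        final int[] docs; // must be sorted ascending (binarySearch is used below)

        Clause(String field, float score, int... docs) {
            this.field = field;
            this.score = score;
            this.docs = docs;
        }
    }

    // Made-up postings; only doc 0 matches all four clauses.
    static List<Clause> sampleClauses() {
        List<Clause> clauses = new ArrayList<>();
        clauses.add(new Clause("param_vendor", 5f, 0, 2));
        clauses.add(new Clause("param_model", 5f, 0, 1));
        clauses.add(new Clause("param_value", 3f, 0));
        clauses.add(new Clause("param_name", 4f, 0, 2));
        return clauses;
    }

    // Doc-at-a-time: for each doc, look at every clause, so the number of
    // matching clauses is known at the moment the doc is scored.
    static Map<Integer, Float> scoreDocAtATime(List<Clause> clauses, int maxDoc, float allMatchedFactor) {
        Map<Integer, Float> scores = new TreeMap<>();
        for (int doc = 0; doc < maxDoc; doc++) {
            float sum = 0f;
            int matched = 0;
            for (Clause c : clauses) {
                if (Arrays.binarySearch(c.docs, doc) >= 0) {
                    sum += c.score;
                    matched++;
                }
            }
            if (matched == 0) {
                continue; // doc matches no clause, so it is not a hit
            }
            if (matched == clauses.size()) {
                sum *= allMatchedFactor; // every clause matched: multiply the score
            }
            scores.put(doc, sum);
        }
        return scores;
    }

    // Term-at-a-time: walk one clause's postings at a time and accumulate.
    // Which clauses contributed to a doc is not visible per doc here.
    static Map<Integer, Float> scoreTermAtATime(List<Clause> clauses) {
        Map<Integer, Float> scores = new TreeMap<>();
        for (Clause c : clauses) {
            for (int doc : c.docs) {
                scores.merge(doc, c.score, Float::sum);
            }
        }
        return scores;
    }

    public static void main(String[] args) {
        List<Clause> clauses = sampleClauses();
        System.out.println(scoreDocAtATime(clauses, 3, 2f)); // {0=34.0, 1=5.0, 2=9.0}
        System.out.println(scoreTermAtATime(clauses));       // {0=17.0, 1=5.0, 2=9.0}
    }
}
```

In this sketch doc 0 matches all four clauses, so the per-document loop doubles its summed score from 17 to 34, while the per-term loop yields the plain sum. Whether a real Lucene collector can observe the same per-clause information depends on which scorer is in use, as discussed in the thread.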