From: Vadim Gindin
Date: Sun, 31 Dec 2017 17:06:11 +0500
Subject: Re: Query in a doc context
To: java-user@lucene.apache.org

Thanks Mikhail! I'll look there. Happy new year )

Regards
Vadim Gindin

On 31 Dec 2017 at 2:21, "Mikhail Khludnev" wrote:

> Literally, it's done in Solr (excuse moi) via
> q=field1:(foo bar baz)^=3 field2:(foo bar baz)^=4 field3:(foo bar baz)^=5
> but it's absolutely the wrong way to approach the problem; you can find dismax
> and the white elephant problem in Relevant Search by Mr Turnbull.
>
> On Tue, Dec 26, 2017 at 10:01 PM, Vadim Gindin
> wrote:
>
> > Mike,
> >
> > I need the following. I want to create a query using the following
> > information: the query string "blah blah blah" and a map of constant scores:
> >
> > "field1" -> 3.0
> > "field2" -> 4.0
> > "field3" -> 5.0
> >
> > // field1, field2, field3 - fields in the index.
> >
> > The created query should search for "blah blah blah" in each specified field.
> > If the search string is found in field1, the query score would be 3.0,
> > field2 -> 4.0 and so on. The final score would be the sum over the fields
> > where the search string is found.
> >
> > I've implemented that and some additional things, like extending the
> > explanation and composing the summed scores.
> >
> > Regards,
> > Vadim Gindin
> >
> >
> > On Fri, Dec 15, 2017 at 10:33 PM, Mike Dinescu (DNQ)
> > wrote:
> >
> > > Got it. I misunderstood the question (actually I'm still not convinced I
> > > fully understand what you're looking for).
> > > It might be good to give an example in case others on the mailing list
> > > are confused.
> > >
> > > *Mike*
> > >
> > >
> > > On Thu, Dec 14, 2017 at 8:54 AM, Vadim Gindin
> > > wrote:
> > >
> > > > Mike,
> > > >
> > > > I don't need a full doc match. I need a multi-field match, and later I
> > > > need to know which fields matched for a document, to be able to calculate
> > > > other multi-field-oriented metrics.
> > > >
> > > > Regards,
> > > > Vadim Gindin
> > > >
> > > > On Thu, Dec 14, 2017 at 8:46 PM, Mike Dinescu (DNQ) <mdinescu@donaq.com>
> > > > wrote:
> > > >
> > > > > Apologies if I completely misunderstood, but if you are looking to do a
> > > > > full doc match, you could duplicate the doc into another field that
> > > > > is a true full-text index of the document.
> > > > >
> > > > > And search on that. Wouldn't that be exactly what you want?
> > > > >
> > > > > On Thu, Dec 14, 2017 at 6:53 AM Vadim Gindin
> > > > > wrote:
> > > > >
> > > > > > Thanks Mikhail
> > > > > >
> > > > > > Could you describe your sentences in more detail?
> > > > > >
> > > > > > Vadim
> > > > > >
> > > > > > On Thu, Dec 14, 2017 at 7:08 PM, Mikhail Khludnev <mkhl@apache.org>
> > > > > > wrote:
> > > > > >
> > > > > > > Hello, Vadim.
> > > > > > >
> > > > > > > Please find inline.
> > > > > > >
> > > > > > > On Thu, Dec 14, 2017 at 11:43 AM, Vadim Gindin <vgindin@detectum.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hi all.
> > > > > > > >
> > > > > > > > As far as I understand, all Queries (or most of them?) are single-field
> > > > > > > > oriented. They may implement different search/score logic, but they are
> > > > > > > > intended for a single field. For example, simple TermQuery or
> > > > > > > > PhraseQuery.
> > > > > > > > If I need to implement the search across different fields, I should use
> > > > > > > > BooleanQuery to combine several single-field queries.
> > > > > > > >
> > > > > > > > Did I understand that right?
> > > > > > >
> > > > > > > Absolutely
> > > > > > >
> > > > > > > > What is an appropriate way to implement a document-wise Query?
> > > > > > >
> > > > > > > 1. DisjunctionScorer.getChildren(): painful doc-at-time handling
> > > > > > > 2. a quite promising idea is to amend the buffer in the term-at-time
> > > > > > > BooleanScorer to track every doc-term hit.
> > > > > > > 3. probably it can be done by copying all terms into a single field
> > > > > > > while storing the original field in payloads, but it's reaalllly sloooow
> > > > > > >
> > > > > > > > I need to have the ability to combine the field matches of one document
> > > > > > > > and analyze them. In particular, to check whether all query terms are
> > > > > > > > matched (in one field or in different fields). I need to be able to fetch
> > > > > > > > the corresponding information: which terms matched which fields and so on.
> > > > > > > >
> > > > > > > > It seems that BooleanQuery/BooleanScorer is not a good place to
> > > > > > > > accumulate information from child Queries/Scorers.
> > > > > > >
> > > > > > > --
> > > > > > > Sincerely yours
> > > > > > > Mikhail Khludnev
> > > > >
> > > > > --
> > > > > *Mike Dinescu*
> > > > > Donaq LLC, Founder
> > > > > +1 (312) 924 0600
> > > > > www.donaq.com
> > > > > http://linkedin.com/company/donaq-llc
> > > > >
> > > > >
> > > > > *CONFIDENTIAL COMMUNICATION:* This message is intended only for the named
> > > > > recipient(s) above.
> > > > > It may contain confidential information that is
> > > > > privileged or that constitutes work product of Donaq LLC. If you are not
> > > > > the intended recipient, you are hereby notified that any dissemination,
> > > > > distribution or copying of this e-mail and any attachment(s) is strictly
> > > > > prohibited.
>
> --
> Sincerely yours
> Mikhail Khludnev
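
The scoring discussed in this thread (one constant per field, summed over the fields where the query matched) and the bookkeeping asked about in the first message (which terms hit which fields, and whether every term matched somewhere) can be modelled in plain Java. The sketch below is only an illustration and uses no Lucene classes. The class and method names are hypothetical, whitespace splitting stands in for a real analyzer, and it assumes a field counts as matched when it contains at least one query term, which may not be what the original implementation does.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical plain-Java model of per-field constant scoring; not Lucene code. */
final class ConstantFieldScoreSketch {

    /** For each query term, collect the names of the fields whose text contains it. */
    static Map<String, Set<String>> matchedFieldsByTerm(Map<String, String> doc,
                                                        List<String> queryTerms) {
        Map<String, Set<String>> hits = new LinkedHashMap<>();
        for (String term : queryTerms) {
            Set<String> fields = new TreeSet<>();
            for (Map.Entry<String, String> field : doc.entrySet()) {
                // Assumption: naive lowercase + whitespace tokenization instead of an analyzer.
                List<String> tokens =
                        List.of(field.getValue().toLowerCase(Locale.ROOT).split("\\s+"));
                if (tokens.contains(term.toLowerCase(Locale.ROOT))) {
                    fields.add(field.getKey());
                }
            }
            hits.put(term, fields);
        }
        return hits;
    }

    /** Sum the constant weight of every field that matched at least one term. */
    static double score(Map<String, Set<String>> hits, Map<String, Double> weights) {
        Set<String> matchedFields = new TreeSet<>();
        hits.values().forEach(matchedFields::addAll);
        double sum = 0.0;
        for (String field : matchedFields) {
            sum += weights.getOrDefault(field, 0.0);
        }
        return sum;
    }

    /** True when every query term matched in some field (the same one or different ones). */
    static boolean allTermsMatched(Map<String, Set<String>> hits) {
        return hits.values().stream().noneMatch(Set::isEmpty);
    }

    public static void main(String[] args) {
        Map<String, Double> weights = Map.of("field1", 3.0, "field2", 4.0, "field3", 5.0);
        Map<String, String> doc = Map.of(
                "field1", "foo qux",
                "field2", "nothing here",
                "field3", "bar baz");
        Map<String, Set<String>> hits = matchedFieldsByTerm(doc, List.of("foo", "bar", "baz"));
        System.out.println(hits);                  // {foo=[field1], bar=[field3], baz=[field3]}
        System.out.println(score(hits, weights));  // 8.0
        System.out.println(allTermsMatched(hits)); // true
    }
}
```

Keeping the term-to-fields map as the intermediate result means the score and any other multi-field metric are computed from the same per-document data, which is the kind of information the thread says is hard to collect from inside BooleanQuery/BooleanScorer.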