From: Rilpa Jain
To: java-user@lucene.apache.org
Date: Tue, 18 Jul 2017 16:14:11 -0400
Subject: Migration to Lucene 6.5 - Filters to Queries

Hi,

We plan to migrate from Lucene 5.5 to 6.5. We have been using DocValuesTermsFilter extensively, which was deprecated in Lucene 5.5 and removed in Lucene 6.0.
The Javadoc says to use DocValuesTermsQuery with BooleanClause.Occur.FILTER instead. However, in our local tests, searches take longer after this change.

Below is one of the scenarios in our application.
We do a search within a search.

(Before migration, on Lucene 5.5)
The first search is on a text field with discrete values. (There is no pattern to the values of this field. Here terms[] ranges from 1 to 200k in size.) We use DocValuesTermsFilter and pass it as the Filter parameter to the search method.
The second search runs on the result of step 1. It can be either a TermQuery or a NumericRangeQuery, which is passed as the query parameter to the search method.

(After migration to Lucene 6.5)
The first search is on a text field with discrete values. (There is no pattern to the values of this field. Here terms[] ranges from 1 to 200k in size.) We use DocValuesTermsQuery and add it to a BooleanQuery with Occur.FILTER.
The second search runs on the result of step 1. It can be either a TermQuery or a NumericRangeQuery, added to the BooleanQuery with Occur.MUST.
The BooleanQuery is built and passed to the search method.

After migration, this query takes 5x-10x longer to execute than it did with DocValuesTermsFilter.

Is there a better class for building the query in our scenario than the one used above? Or is there anything that I am missing?
Any insights would help! Thanks.
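
For reference, the Lucene 6.5 setup described above can be sketched roughly as follows. This is an illustration only, not the application's actual code: the class name FilteredSearchSketch, the method build, and any field names or values passed to it are hypothetical, and it assumes the Lucene 6.5 jars that provide DocValuesTermsQuery are on the classpath. Only the TermQuery variant of the second step is shown.

```java
import java.util.Collection;

import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause.Occur;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.DocValuesTermsQuery;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.util.BytesRef;

// Hypothetical helper, named here for illustration only.
public class FilteredSearchSketch {

    /**
     * Builds the two-step query from the message:
     * step 1 restricts matches to documents whose doc-values field holds one of the given terms,
     * step 2 must additionally match a single term.
     */
    static BooleanQuery build(String filterField, Collection<BytesRef> terms, Term secondStep) {
        return new BooleanQuery.Builder()
            // Step 1: non-scoring restriction, the replacement for DocValuesTermsFilter.
            .add(new DocValuesTermsQuery(filterField, terms), Occur.FILTER)
            // Step 2: the scoring clause evaluated together with the filter.
            .add(new TermQuery(secondStep), Occur.MUST)
            .build();
    }
}
```

The resulting BooleanQuery is what gets passed to IndexSearcher.search in the scenario above.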