From: Trejkaz
Date: Wed, 27 Jun 2018 11:51:08 +0900
In-Reply-To: <7ff8416f0d0843fb82c14f2c5e8fed21@SYSRV214.exdom01.lan>
Subject: Re: Efficient way to define large Boolean Occur.FILTER clause in Lucene 6
To: Lucene Users Mailing List

On Tue, Jun 26, 2018 at 7:02 PM, Hasenberger, Josef wrote:
> However, I have a feeling that the conversion from Long values to Terms is
> rather inefficient for large collections and also uses a lot of memory.
> To ease conversion overhead somewhat, I created a class that converts a
> Long value directly to BytesRef instance (in order to avoid conversion to
> UTF16 and then UTF8 again) and pass that instance to the Term constructor.

First thought is, why are you using TermsQuery if they're in DocValues?
Is DocValuesTermsQuery any better? It does depend on how many terms
you're searching for.

Second thought is that there is also DocValuesNumbersQuery, which
avoids having to convert all the values.

> I just wonder if there is a better method for passing large amount of filter criteria
> to a BooleanQuery Occur.FILTER clause, that avoids excessive object creation.

If you can get your long values into something which implements Bits,
you could make a query using RandomAccessWeight to directly point at
the existing set you already have in memory.
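
To make the Bits idea a little more concrete, here is a minimal sketch in
plain Java. It deliberately uses no Lucene classes: DocBits is a hypothetical
stand-in that only copies the two methods of Lucene's Bits interface
(get(int) and length()), and the perDocValues array is an assumed stand-in
for one segment's numeric doc values, indexed by doc ID. The names
ValueSetFilterSketch and matching are made up for illustration. The point is
only that membership can be tested per document against a primitive long[]
without creating a Term or BytesRef per value.

```java
import java.util.Arrays;

/** Stand-alone sketch; none of these types are Lucene classes. */
final class ValueSetFilterSketch {

    /** Hypothetical stand-in with the same two methods as Lucene's Bits. */
    interface DocBits {
        boolean get(int docId);
        int length();
    }

    /**
     * Builds a per-document view that answers "does this doc's value
     * belong to the allowed set?".
     *
     * perDocValues[docId] is assumed to play the role of a numeric
     * doc-values column for a single segment.
     */
    static DocBits matching(long[] perDocValues, long[] allowedValues) {
        // Sort a private copy once, so each lookup is a binary search
        // over primitives and no per-value objects are allocated.
        final long[] sorted = allowedValues.clone();
        Arrays.sort(sorted);

        return new DocBits() {
            @Override
            public boolean get(int docId) {
                return Arrays.binarySearch(sorted, perDocValues[docId]) >= 0;
            }

            @Override
            public int length() {
                return perDocValues.length;
            }
        };
    }
}
```

In a real index the per-document values would be read from the segment's
doc values rather than from an array, and DocValuesNumbersQuery may already
cover the whole use case, so treat this purely as an illustration of the
shape of the approach.
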
TX