From: Alan Woodward
To: java-user@lucene.apache.org
Subject: Re: Memory problem with TermQuery
Date: Mon, 8 Jun 2015 11:55:51 +0100
In-Reply-To: <26e6b9593bfd4e21b81d3d50ecc242bf@KIM02.tis.local>
Message-Id: <77870F57-E8BC-44EF-95CD-7EA34C213C77@flax.co.uk>

You'll still need to call rewrite, but it needs to be done per-reader, so you'll need to cache the queries *before* they're rewritten, and then call rewrite whenever you create a new IndexReader. Otherwise you'll get incorrect scores, and possibly missed hits as well.

Alan Woodward
www.flax.co.uk

On 8 Jun 2015, at 11:46, Anna Maier wrote:

> Hi Alan,
>
> you are right, we are calling rewrite on our query at some point. OK, it would probably be an option to take that out.
> Thanks for the hint!
>
> Best,
> Anna
>
> -----Original Message-----
> From: Alan Woodward [mailto:alan@flax.co.uk]
> Sent: Monday, 8 June 2015 12:23
> To: java-user@lucene.apache.org
> Subject: Re: Memory problem with TermQuery
>
> Hi Anna,
>
> In normal usage, perReaderTermState will be null, and TermQuery will be very lightweight. It's in particular expert use cases (generally after queries have been rewritten against a specific IndexReader) that the perReaderTermState will be initialized. Are you caching rewritten queries somehow?
>
> Alan Woodward
> www.flax.co.uk
>
>
> On 8 Jun 2015, at 10:49, Anna Maier wrote:
>
>> Hi,
>>
>> we ran into a memory problem with TermQuery: in our program, we build a TermQuery object from the user input and pass it around, to be able to do different things, like execute the query again and so on. So, the TermQuery object can potentially exist for some time.
>> Now it turns out that a TermQuery keeps a reference to an IndexReader (via the perReaderTermState field).
>> This keeps our program from throwing old readers away when new ones are opened. This has quite an impact on the required memory, especially for big indices. It is no longer feasible to keep a long-lived reference to a TermQuery.
>>
>> I'm wondering: is this a bug? After all, I would have expected the TermQuery to be a lightweight object. Or is the TermQuery not intended to be passed around in the program at all?
>>
>> Best,
>> Anna
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
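
The advice at the top of this thread (keep the query as it was *before* rewriting, and rewrite it again for each new reader) could be sketched roughly as follows. This is only an illustration of the pattern, not Lucene code: Reader, Query, SavedSearch and the boundTo field are hypothetical stand-ins written against the plain JDK so the sketch runs on its own. In a real application the rewrite step would presumably go through Lucene's own rewrite call on the current searcher or reader.

```java
final class PerReaderRewriteSketch {

    /** Hypothetical stand-in for an index reader (NOT Lucene's IndexReader). */
    static final class Reader {
        final String name;

        Reader(String name) {
            this.name = name;
        }
    }

    /** Hypothetical stand-in for a query; it holds no reader until rewritten. */
    static final class Query {
        final String term;
        final Reader boundTo; // null until rewritten; plays the role of perReaderTermState

        Query(String term) {
            this(term, null);
        }

        private Query(String term, Reader boundTo) {
            this.term = term;
            this.boundTo = boundTo;
        }

        /** Returns a new reader-bound copy; the original query is left untouched. */
        Query rewrite(Reader reader) {
            return new Query(term, reader);
        }
    }

    /** Long-lived holder that stores only the unrewritten query. */
    static final class SavedSearch {
        private final Query original;

        SavedSearch(Query original) {
            if (original.boundTo != null) {
                throw new IllegalArgumentException("cache the query before rewriting it");
            }
            this.original = original;
        }

        Query original() {
            return original;
        }

        /** Rewrites against the reader passed in; the result is never stored here. */
        Query forReader(Reader current) {
            return original.rewrite(current);
        }
    }

    public static void main(String[] args) {
        SavedSearch saved = new SavedSearch(new Query("lucene"));

        Reader oldReader = new Reader("gen-1");
        Query first = saved.forReader(oldReader);

        // A new reader is opened later; rewrite again instead of reusing 'first'.
        Reader newReader = new Reader("gen-2");
        Query second = saved.forReader(newReader);

        // prints: gen-1 gen-2 true
        System.out.println(first.boundTo.name + " " + second.boundTo.name
                + " " + (saved.original().boundTo == null));
    }
}
```

Because the holder never keeps a rewritten query, nothing long-lived points at an old reader, so the old reader can be released once searches against it have finished.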