From: "Uwe Schindler"
Reply-To: java-user@lucene.apache.org
Subject: RE: Lucene 4.4.0 mergeSegments OutOfMemoryError
Date: Thu, 26 Sep 2013 12:38:00 +0200

Hi,

TieredMergePolicy, which is the default since around Lucene 3.2, prefers
merging segments with many deletions, so forceMerge(1) is not needed.

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de

> -----Original Message-----
> From: Michael van Rooyen [mailto:michael@loot.co.za]
> Sent: Thursday, September 26, 2013 12:26 PM
> To: java-user@lucene.apache.org
> Cc: Ian Lea
> Subject: Re: Lucene 4.4.0 mergeSegments OutOfMemoryError
>
> Yes, it happens as part of the early morning optimize, and yes, it's a
> forceMerge(1) which I've disabled for now.
>
> I haven't looked at the persistence mechanism for Lucene since 2.x, but
> if I remember correctly, the deleted documents would stay in an index
> segment until that segment was eventually merged. Without forcing a
> merge (optimize in old versions), the footprint on disk could be a
> multiple of the actual space required for the live documents, and this
> would have an impact on performance (the deleted documents would clutter
> the buffer cache).
>
> Is this still the case? I would have thought it good practice to force
> the dead space out of an index periodically, but if the underlying
> storage mechanism has changed and the current index files are more
> efficient at housekeeping, this may no longer be necessary.
>
> If someone could shed a little light on best practice for indexes where
> documents are frequently updated (i.e. deleted and re-added), that would
> be great.
>
> Michael.
>
>
> On 2013/09/26 11:43 AM, Ian Lea wrote:
> > Is this OOM happening as part of your early morning optimize or at
> > some other point? By optimize do you mean IndexWriter.forceMerge(1)?
> > You really shouldn't have to use that. If the index grows forever
> > without it then something else is going on which you might wish to
> > report separately.
> >
> >
> > --
> > Ian.
> >
> >
> > On Wed, Sep 25, 2013 at 12:35 PM, Michael van Rooyen wrote:
> >> We've recently upgraded to Lucene 4.4.0 and mergeSegments now causes
> >> an OOM error.
> >>
> >> As background, our index contains about 14 million documents (growing
> >> slowly) and we process about 1 million updates per day. It's about
> >> 8GB on disk. I'm not sure if the Lucene segments merge the way they
> >> used to in the early versions, but we've always optimized at 3am to
> >> get rid of dead space in the index, or otherwise it grows forever.
> >>
> >> The mergeSegments was working under 4.3.1 but the index has grown
> >> somewhat on disk since then, probably due to a couple of added
> >> NumericDocValues fields. The java process is assigned about 3GB (the
> >> maximum, as it's running on a 32 bit i686 Linux box), and it still
> >> goes OOM.
> >>
> >> Any advice as to the possible cause and how to circumvent it would be
> >> great. Here's the stack trace:
> >>
> >> org.apache.lucene.index.MergePolicy$MergeException:
> >> java.lang.OutOfMemoryError: Java heap space
> >> org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:545)
> >> org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:518)
> >> Caused by: java.lang.OutOfMemoryError: Java heap space
> >> org.apache.lucene.codecs.lucene42.Lucene42DocValuesProducer.loadNumeric(Lucene42DocValuesProducer.java:212)
> >> org.apache.lucene.codecs.lucene42.Lucene42DocValuesProducer.getNumeric(Lucene42DocValuesProducer.java:174)
> >> org.apache.lucene.index.SegmentCoreReaders.getNormValues(SegmentCoreReaders.java:301)
> >> org.apache.lucene.index.SegmentReader.getNormValues(SegmentReader.java:253)
> >> org.apache.lucene.index.SegmentMerger.mergeNorms(SegmentMerger.java:215)
> >> org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:119)
> >> org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:3772)
> >> org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3376)
> >> org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:405)
> >> org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:482)
> >>
> >>
> >> Thanks,
> >> Michael.
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >> For additional commands, e-mail: java-user-help@lucene.apache.org
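
On the dead-space question raised in this thread, it may help to measure
how much of the index is actually deleted before deciding whether any
manual merge is worthwhile. The sketch below is only an illustration: the
class and method names are hypothetical, and plain ints stand in for the
values Lucene reports through IndexReader.maxDoc() and
IndexReader.numDocs(), so it runs without Lucene on the classpath.

```java
// Hypothetical helper, not part of Lucene. Plain ints stand in for
// IndexReader.maxDoc() and IndexReader.numDocs().
final class DeletedDocsEstimate {

    /** Percentage of document slots occupied by deleted documents. */
    static double deletedPercent(int maxDoc, int numDocs) {
        if (maxDoc < 0 || numDocs < 0 || numDocs > maxDoc) {
            throw new IllegalArgumentException("expected 0 <= numDocs <= maxDoc");
        }
        if (maxDoc == 0) {
            return 0.0; // empty index: nothing to reclaim
        }
        return 100.0 * (maxDoc - numDocs) / maxDoc;
    }

    public static void main(String[] args) {
        // Figures from this thread: about 14 million live documents and
        // about 1 million updates per day.
        // Assumption (deliberate worst case): every update leaves one deleted
        // document behind and no merge reclaims any of them during the day.
        int liveDocs = 14_000_000;
        int deletedDocs = 1_000_000;
        double pct = deletedPercent(liveDocs + deletedDocs, liveDocs);
        System.out.printf("deleted: %.1f%%%n", pct);
    }
}
```

Under that worst-case assumption, one day of updates leaves about 6.7
percent of the document slots deleted. Whether that level calls for any
manual step is a judgment call; the point made in the reply above is that
TieredMergePolicy already prefers merging segments with many deletions.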