From: "Uwe Schindler"
Subject: RE: debugging growing index size
Date: Fri, 13 Nov 2015 18:05:29 +0100
Message-ID: <00f401d11e35$7fb4cb70$7f1e6250$@thetaphi.de>

Did you disable unmapping using MMapDirectory#setEnableUnmap() ? By
default it should be enabled, but maybe you disabled it for some reason?

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de

> -----Original Message-----
> From: Rob Audenaerde [mailto:rob.audenaerde@gmail.com]
> Sent: Friday, November 13, 2015 5:24 PM
> To: java-user@lucene.apache.org
> Subject: Re: debugging growing index size
>
> I'm currently running using NIOFS. It seems to prevent the issue from
> appearing.
>
> This is a second run (with applied deletes etc)
>
> raudenaerd@:/<6>index/index$sudo ls -lSra *.dvd
> -rw-r--r--. 1 apache apache      7993 Nov 13 16:09 _y_Lucene50_0.dvd
> -rw-r--r--. 1 apache apache  39048886 Nov 13 17:12 _xod_Lucene50_0.dvd
> -rw-r--r--. 1 apache apache  53699972 Nov 13 17:17 _110e_Lucene50_0.dvd
> -rw-r--r--. 1 apache apache 112855516 Nov 13 17:19 _12r5_Lucene50_0.dvd
> -rw-r--r--. 1 apache apache 151149886 Nov 13 17:13 _y0s_Lucene50_0.dvd
> -rw-r--r--. 1 apache apache 222062059 Nov 13 17:17 _z20_Lucene50_0.dvd
>
> raudenaerde:/<6>index/index$sudo ls -lSaa *.dvd
> -rw-r--r--. 1 apache apache 222062059 Nov 13 17:17 _z20_Lucene50_0.dvd
> -rw-r--r--. 1 apache apache 151149886 Nov 13 17:13 _y0s_Lucene50_0.dvd
> -rw-r--r--. 1 apache apache 112855516 Nov 13 17:19 _12r5_Lucene50_0.dvd
> -rw-r--r--. 1 apache apache  53699972 Nov 13 17:17 _110e_Lucene50_0.dvd
> -rw-r--r--. 1 apache apache  39048886 Nov 13 17:12 _xod_Lucene50_0.dvd
> -rw-r--r--. 1 apache apache      7993 Nov 13 16:09 _y_Lucene50_0.dvd
>
>
> On Thu, Nov 12, 2015 at 3:40 PM, Michael McCandless <
> lucene@mikemccandless.com> wrote:
>
> > Hi Rob,
> >
> > A couple more things:
> >
> > Can you print the value of MMapDirectory.UNMAP_SUPPORTED?
> >
> > Also, can you try your test using NIOFSDirectory instead? Curious if
> > that changes things...
> >
> > Mike McCandless
> >
> > http://blog.mikemccandless.com
> >
> >
> > On Thu, Nov 12, 2015 at 7:28 AM, Rob Audenaerde
> > wrote:
> > > Curious indeed!
> > >
> > > I will turn on the IndexFileDeleter.VERBOSE_REF_COUNTS and recreate
> > > the logs. Will get back with them in a day hopefully.
> > >
> > > Thanks for the extra logging!
> > >
> > > -Rob
> > >
> > > On Thu, Nov 12, 2015 at 11:34 AM, Michael McCandless <
> > > lucene@mikemccandless.com> wrote:
> > >
> > >> Hmm, curious.
> > >>
> > >> I looked at the [large] infoStream output and I see segment _3ou7
> > >> present on init of IW, a few getReader calls referencing it, then a
> > >> forceMerge that indeed merges it away, yet I do NOT see IW attempting
> > >> deletion of its files.
> > >>
> > >> And indeed I see plenty (too many: many times per second?) of commits
> > >> after that, so the index itself is no longer referencing _3ou7.
> > >>
> > >> If you are failing to close all NRT readers then I would expect _3ou7
> > >> to be in the lsof output, but it's not.
> > >>
> > >> The NRT readers close method has logic that notifies IndexWriter when
> > >> it's done "needing" the files, to emulate "delete on last close"
> > >> semantics for filesystems like HDFS that don't do that ... it's
> > >> possible something is wrong here.
> > >>
> > >> Can you set the (public, static) boolean
> > >> IndexFileDeleter.VERBOSE_REF_COUNTS to true, and then re-generate this
> > >> log?
> > >> This causes IW to log the ref count of each file it's tracking
> > >> ...
> > >>
> > >> I'll also add a bit more verbosity to IW when NRT readers are opened
> > >> and closed, for 5.4.0.
> > >>
> > >> Mike McCandless
> > >>
> > >> http://blog.mikemccandless.com
> > >>
> > >>
> > >> On Wed, Nov 11, 2015 at 6:09 AM, Rob Audenaerde
> > >> wrote:
> > >> > Hi all,
> > >> >
> > >> > I'm still debugging the growing index size. I think closing index
> > >> > readers might help (work in progress), but I can't really see them
> > >> > holding on to files (at least, using lsof). Restarting the
> > >> > application sheds some light: I see logging on files that are no
> > >> > longer referenced.
> > >> >
> > >> > What I see is that there are files in the index directory that seem
> > >> > to no longer be referenced.
> > >> >
> > >> > I put the output of the infoStream online, because it is rather big
> > >> > (30MB gzipped): http://www.audenaerde.org/lucene/merges.log.gz
> > >> >
> > >> > Output of lsof (executed 'sudo lsof *' in the index directory). This
> > >> > is on a CentOS box (maybe that influences stuff as well?)
> > >> >
> > >> > COMMAND   PID USER   FD   TYPE DEVICE   SIZE/OFF     NODE NAME
> > >> > java    30581 apache mem  REG   253,0 3176094924 18880508 _4gs5_Lucene50_0.dvd
> > >> > java    30581 apache mem  REG   253,0  505758610 18880546 _4gs5.fdt
> > >> > java    30581 apache mem  REG   253,0  369563337 18880631 _4gs5_Lucene50_0.tim
> > >> > java    30581 apache mem  REG   253,0  176344058 18880623 _4gs5_Lucene50_0.pos
> > >> > java    30581 apache mem  REG   253,0  378055201 18880606 _4gs5_Lucene50_0.doc
> > >> > java    30581 apache mem  REG   253,0  372579599 18880400 _4i5a_Lucene50_0.dvd
> > >> > java    30581 apache mem  REG   253,0   82017447 18880748 _4g37.cfs
> > >> > java    30581 apache mem  REG   253,0   85376507 18880721 _4fb3.cfs
> > >> > java    30581 apache mem  REG   253,0  363493917 18880533 _4ct1_Lucene50_0.dvd
> > >> > java    30581 apache mem  REG   253,0    9421892 18880806 _4gjc.cfs
> > >> > java    30581 apache mem  REG   253,0   76877461 18880553 _4ct1.fdt
> > >> > java    30581 apache mem  REG   253,0   46271330 18880661 _4ct1_Lucene50_0.tim
> > >> > java    30581 apache mem  REG   253,0   26911387 18880653 _4ct1_Lucene50_0.pos
> > >> > java    30581 apache mem  REG   253,0   54678249 18880568 _4ct1_Lucene50_0.doc
> > >> > java    30581 apache mem  REG   253,0   76556587 18880328 _4i5a.fdt
> > >> > java    30581 apache mem  REG   253,0   45032159 18880389 _4i5a_Lucene50_0.tim
> > >> > java    30581 apache mem  REG   253,0   26486772 18880388 _4i5a_Lucene50_0.pos
> > >> > java    30581 apache mem  REG   253,0   55411002 18880362 _4i5a_Lucene50_0.doc
> > >> > java    30581 apache mem  REG   253,0   70484185 18880340 _4hkn.cfs
> > >> > java    30581 apache mem  REG   253,0   10873921 18880324 _4gpz.cfs
> > >> > java    30581 apache mem  REG   253,0   17230506 18880524 _4i11.cfs
> > >> > java    30581 apache mem  REG   253,0    6706969 18880575 _4i0t.cfs
> > >> > java    30581 apache mem  REG   253,0   15135578 18880624 _4i0i.cfs
> > >> > java    30581 apache mem  REG   253,0   15368310 18880717 _4hzp.cfs
> > >> > java    30581 apache mem  REG   253,0    5146140 18880583 _4hze.cfs
> > >> > java    30581 apache mem  REG   253,0    2917380 18880411 _4gs5.nvd
> > >> > java    30581 apache mem  REG   253,0    6871469 18880732 _4hod.cfs
> > >> > java    30581 apache mem  REG   253,0    2860341 18880495 _4i84.cfs
> > >> > java    30581 apache mem  REG   253,0     835726 18880660 _4i7z.cfs
> > >> > java    30581 apache mem  REG   253,0    1005595 18880648 _4i7w.cfs
> > >> > java    30581 apache mem  REG   253,0    5639672 18880401 _4i4o.cfs
> > >> > java    30581 apache mem  REG   253,0    4388371 18880440 _4i4a.cfs
> > >> > java    30581 apache mem  REG   253,0    1151845 18880512 _4i7v.cfs
> > >> > java    30581 apache mem  REG   253,0     941773 18880613 _4i7x.cfs
> > >> > java    30581 apache mem  REG   253,0     984023 18880588 _4i7o.cfs
> > >> > java    30581 apache mem  REG   253,0    1790005 18880619 _4i7y.cfs
> > >> > java    30581 apache mem  REG   253,0     466371 18880515 _4ct1.nvd
> > >> > java    30581 apache mem  REG   253,0     723280 18880573 _4i7q.cfs
> > >> > java    30581 apache mem  REG   253,0     806289 18880517 _4i7h.cfs
> > >> > java    30581 apache mem  REG   253,0      17362 18880520 _4i9s.cfs
> > >> > java    30581 apache mem  REG   253,0     698362 18880531 _4i9r.cfs
> > >> > java    30581 apache mem  REG   253,0     483215 18880406 _4i5a.nvd
> > >> > java    30581 apache mem  REG   253,0      14110 18880416 _4i9v.cfs
> > >> > java    30581 apache mem  REG   253,0       6121 18880412 _4i9t.cfs
> > >> > java    30581 apache 30wW REG   253,0          0 18877901 write.lock
> > >> >
> > >> > Output of some of the biggest files in the index directory:
> > >> >
> > >> > -rw-r--r--. 1 apache apache  358684577 Nov 11 08:04 _4fjn.cfs
> > >> > -rw-r--r--. 1 apache apache  363493917 Nov 11 07:54 _4ct1_Lucene50_0.dvd
> > >> > -rw-r--r--. 1 apache apache  369563337 Nov 11 08:06 _4gs5_Lucene50_0.tim
> > >> > -rw-r--r--. 1 apache apache  372579599 Nov 11 08:09 _4i5a_Lucene50_0.dvd
> > >> > -rw-r--r--. 1 apache apache  378055201 Nov 11 08:06 _4gs5_Lucene50_0.doc
> > >> > -rw-r--r--. 1 apache apache  427401813 Nov 10 08:14 _3ou7.cfs
> > >> > -rw-r--r--. 1 apache apache  505758610 Nov 11 08:04 _4gs5.fdt
> > >> > -rw-r--r--. 1 apache apache 1107391579 Nov 10 07:55 _3k3a_Lucene50_0.dvd
> > >> > -rw-r--r--. 1 apache apache 3176094924 Nov 11 08:10 _4gs5_Lucene50_0.dvd
> > >> >
> > >> > Note that the 3ou7 and 3k3a segments no longer appear to be in use?
> > >>
> > >> ---------------------------------------------------------------------
> > >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > >> For additional commands, e-mail: java-user-help@lucene.apache.org
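
The manual check made in this thread (comparing the files that ls shows in
the index directory with the files that lsof reports as open) can be
sketched in Java. This is a minimal, hypothetical helper and not part of
Lucene: the class name SegmentFileDiff and its methods are assumptions made
for illustration. It only compares file names, and it assumes that segment
files start with an underscore followed by letters and digits, as in the
listings above. A segment that is missing from the open-file list is only a
hint, not proof, that its files are no longer referenced.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper (not a Lucene class): diffs segment names on disk against open files. */
final class SegmentFileDiff {

    /**
     * Returns the segment prefix of an index file name, for example "_4gs5" for
     * "_4gs5_Lucene50_0.dvd" or "_4gs5.fdt". Returns null for names that do not
     * start with an underscore (for example "write.lock").
     */
    static String segmentName(String fileName) {
        if (!fileName.startsWith("_")) {
            return null;
        }
        int end = 1;
        // Assumption: the segment name is the run of letters and digits after the underscore.
        while (end < fileName.length() && Character.isLetterOrDigit(fileName.charAt(end))) {
            end++;
        }
        return end > 1 ? fileName.substring(0, end) : null;
    }

    /** Collects the distinct segment names found in a list of file names, sorted. */
    static Set<String> segmentsOf(List<String> fileNames) {
        Set<String> segments = new TreeSet<>();
        for (String fileName : fileNames) {
            String segment = segmentName(fileName);
            if (segment != null) {
                segments.add(segment);
            }
        }
        return segments;
    }

    /** Segments that have files on disk but no file in the open-file list. */
    static Set<String> unreferencedSegments(List<String> onDisk, List<String> open) {
        Set<String> result = segmentsOf(onDisk);
        result.removeAll(segmentsOf(open));
        return result;
    }
}
```

To use it, pass the NAME column of the lsof output as the open list and the
file names from ls as the on-disk list; segments such as _3ou7 and _3k3a,
which appear only on disk, would then be reported.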