Date: Thu, 12 Mar 2009 13:48:13 -0700
Subject: Re: problems with large Lucene index
From: Ted Dunning
To: general@lucene.apache.org

Try running with verbose GC. That will give you more details about what is
happening.

Even better, run with jconsole on the side so that you get really detailed
information on memory pools.

On Thu, Mar 12, 2009 at 7:30 AM, wrote:

> Thanks Mike and Jokin for your comments on the memory problem. I have
> submitted the query to the Hibernate Search list although I haven't seen a
> response yet.
>
> In the meantime I did my own investigating in the code (I'd rather have
> avoided this!). I'm seeing results that don't make any sense, and maybe
> someone here with more experience of Lucene and the way memory is allocated
> by the JVM can shed light on what are, to me, quite illogical observations.
>
> As you may recall I had a stand-alone Lucene search and a Hibernate Search
> version. Looking in the HS code did not shed any light on the issue.
> I took my stand-alone Lucene code, put it in a method, and replaced the
> search in the HS class (the constructor of QueryHits.java) with the call to
> my method. Bear in mind this method is the same code as posted in my earlier
> message: it sets up the Lucene search from scratch (i.e., no data structures
> created by HS were used). So, effectively I was calling my stand-alone code
> after any setup done by Hibernate and any memory it may have allocated
> (which turned out to be a few MB).
>
> I get OOM! Printing the free memory at this point shows bags of memory
> left. Indeed, the same free memory (+/- a few MB) as the stand-alone Lucene
> version!
>
> I then instrumented the Lucene method where the OOM is occurring
> (FSDirectory.readInternal()). I cannot understand the results I am seeing.
> Below is a snippet of the output of each, with the code around FSDirectory
> line 598 as follows:
>
> ...
> do {
>   long tot = Runtime.getRuntime().totalMemory();
>   long free = Runtime.getRuntime().freeMemory();
>   System.out.println("LUCENE: offset=" + offset + " total=" + total
>       + " len-total=" + (len - total) + " free mem=" + free
>       + " used =" + (tot - free));
>   int i = file.read(b, offset + total, len - total);
> ...
>
> The stand-alone version:
>
> ...
> LUCENE: offset=0 total=0 len-total=401 free mem=918576864 used =330080544
> LUCENE: offset=0 total=0 len-total=1024 free mem=918576864 used =330080544
> LUCENE: offset=0 total=0 len-total=883 free mem=918576864 used =330080544
> LUCENE: offset=0 total=0 len-total=1024 free mem=918576864 used =330080544
> LUCENE: offset=0 total=0 len-total=1024 free mem=918576864 used =330080544
> LUCENE: offset=0 total=0 len-total=1024 free mem=918576864 used =330080544
> LUCENE: offset=0 total=0 len-total=1024 free mem=918576864 used =330080544
> LUCENE: offset=0 total=0 len-total=1024 free mem=918576864 used =330080544
> LUCENE: offset=0 total=0 len-total=1024 free mem=918576864 used =330080544
> LUCENE: offset=0 total=0 len-total=1024 free mem=918576864 used =330080544
> LUCENE: offset=0 total=0 len-total=1024 free mem=918576864 used =330080544
> LUCENE: offset=0 total=0 len-total=209000000 free mem=631122912 used =617534496
> LUCENE: offset=209000000 total=0 len-total=20900000 free mem=631122912 used =617534496
> LUCENE: offset=229900000 total=0 len-total=20900000 free mem=631122912 used =617534496
> LUCENE: offset=250800000 total=0 len-total=20900000 free mem=631122912 used =617534496
> ...
> completes successfully!
>
> The method called via Hibernate Search:
>
> ...
> LUCENE: offset=0 total=0 len-total=401 free mem=924185480 used =334892152
> LUCENE: offset=0 total=0 len-total=1024 free mem=924185480 used =334892152
> LUCENE: offset=0 total=0 len-total=883 free mem=924185480 used =334892152
> LUCENE: offset=0 total=0 len-total=1024 free mem=924185480 used =334892152
> LUCENE: offset=0 total=0 len-total=1024 free mem=924185480 used =334892152
> LUCENE: offset=0 total=0 len-total=1024 free mem=924185480 used =334892152
> LUCENE: offset=0 total=0 len-total=1024 free mem=924185480 used =334892152
> LUCENE: offset=0 total=0 len-total=1024 free mem=924185480 used =334892152
> LUCENE: offset=0 total=0 len-total=1024 free mem=924185480 used =334892152
> LUCENE: offset=0 total=0 len-total=1024 free mem=924185480 used =334892152
> LUCENE: offset=0 total=0 len-total=1024 free mem=924185480 used =334892152
> LUCENE: offset=0 total=0 len-total=209000000 free mem=636731528 used =622346104
> Exception in thread "main" java.lang.OutOfMemoryError
>         at java.io.RandomAccessFile.readBytes(Native Method)
>         at java.io.RandomAccessFile.read(Unknown Source)
>         at org.apache.lucene.store.FSDirectory$FSIndexInput.readInternal(FSDirectory.java:599)
> ... fails with exception!
>
> Note that the HS version has slightly more free memory because I ran it
> with -Xms1210M as opposed to -Xms1200M for the stand-alone to offset any
> memory used by HS when it starts up.
>
> As you can see, these are identical for all practical purposes. So what
> gives?
>
> I'm stumped, so any suggestions appreciated.
>
> Thanks.
>
> Quoting Michael McCandless:
>
>> Unfortunately, I'm not familiar with exactly what Hibernate search does
>> with the Lucene APIs.
>>
>> It must be doing something beyond what your standalone Lucene test case
>> does.
>>
>> Maybe ask this question on the Hibernate list?
>>
>> Mike

-- 
Ted Dunning, CTO
DeepDyve

111 West Evelyn Ave. Ste. 202
Sunnyvale, CA 94086
www.deepdyve.com
408-773-0110 ext. 738
858-414-0013 (m)
408-773-0220 (fax)
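
For anyone following the suggestion above without a GUI: the per-pool figures that jconsole displays are also available in-process through the standard java.lang.management API. What follows is only a minimal sketch, not something from this thread; the class name MemoryPoolReport and its two helper methods are hypothetical names chosen for illustration. It prints the overall heap usage and then one line per JVM memory pool, which is finer-grained than the Runtime.totalMemory()/freeMemory() pair used in the quoted instrumentation.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class (not part of Lucene or Hibernate Search).
class MemoryPoolReport {

    // Builds one line per JVM memory pool (heap and non-heap).
    static List<String> poolLines() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // getUsage() returns null if the pool is no longer valid
            }
            lines.add(pool.getName() + " (" + pool.getType() + "): used=" + usage.getUsed()
                    + " committed=" + usage.getCommitted() + " max=" + usage.getMax());
        }
        return lines;
    }

    // Summarises the whole Java heap as seen by the JVM.
    static String heapSummary() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return "heap used=" + heap.getUsed() + " committed=" + heap.getCommitted()
                + " max=" + heap.getMax();
    }

    public static void main(String[] args) {
        System.out.println(heapSummary());
        for (String line : poolLines()) {
            System.out.println(line);
        }
    }
}
```

Calling heapSummary() and poolLines() just before the file.read(...) call would log the same kind of snapshot as the quoted println, broken down by pool. Starting the JVM with -verbose:gc alongside it adds a line per collection. Note that these figures cover only memory managed by the JVM.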