Date: Sun, 14 May 2006 12:07:43 -0400
From: "Beady Geraghty"
To: java-user@lucene.apache.org
Subject: out-of-memory when searching, paging does not work.

I get an out-of-memory error when returning many hits. I am still on Lucene 1.4.3.

I have a simple term query. It returned 899810 documents. I try to retrieve the name of each document and nothing else, and I run out of memory.

Instead of getting the names all at once, I tried to query again after every 10,000 documents. I close the index reader, the index searcher, and the fsDir, and re-query for every 10,000 documents. This still doesn't work.

From another entry in the forum, it appears that information about the hits I have skipped over is still kept even though I don't access them. Am I understanding correctly that if I start accessing from the 400000th document onwards, some information about documents 0-399999 is still cached even though I have skipped over them?

Is there a way to get the file name (and perhaps other information) of the remaining documents?

(I tried a different term query that returned 400000 hits, and I was able to get the names of all of them without re-querying.)

I think I saw someone mention clearing the hit cache, though I don't know how this is done.

Thank you in advance for any hints on dealing with this.
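
One pattern that may help here is to handle each hit as it is matched instead of paging through a cached result list. The sketch below only illustrates that idea and does not use Lucene: `HitSink`, `FakeIndex`, `NameCounter` and `run` are hypothetical names invented for this example. In Lucene 1.4 the role of `HitSink` is, as far as I know, played by `HitCollector` passed to `Searcher.search(Query, HitCollector)`; please check that against the 1.4.3 Javadoc before relying on it.

```java
/**
 * Stand-in sketch only: none of these types are Lucene classes.
 * HitSink plays the role of a per-hit callback; FakeIndex is a
 * hypothetical in-memory index used purely for illustration.
 */
final class StreamingHitsSketch {

    /** Hypothetical callback, invoked once per matching document id. */
    interface HitSink {
        void collect(int docId);
    }

    /** Hypothetical index: document i is named "doc-i.txt" and matches when i % step == 0. */
    static final class FakeIndex {
        private final int size;
        private final int step;

        FakeIndex(int size, int step) {
            this.size = size;
            this.step = step;
        }

        String name(int docId) {
            return "doc-" + docId + ".txt";
        }

        /** Streams every matching id to the sink; no result list is built or cached. */
        void search(HitSink sink) {
            for (int id = 0; id < size; id++) {
                if (id % step == 0) {
                    sink.collect(id);
                }
            }
        }
    }

    /** Counts hits and keeps only the most recent name, so memory use stays constant. */
    static final class NameCounter implements HitSink {
        private final FakeIndex index;
        int count;
        String lastName;

        NameCounter(FakeIndex index) {
            this.index = index;
        }

        public void collect(int docId) {
            // A real program would write the name to a file or stream here
            // rather than holding all names in memory.
            lastName = index.name(docId);
            count++;
        }
    }

    /** Builds a fake index, streams all matches through a NameCounter, and returns it. */
    static NameCounter run(int size, int step) {
        FakeIndex index = new FakeIndex(size, step);
        NameCounter counter = new NameCounter(index);
        index.search(counter);
        return counter;
    }

    public static void main(String[] args) {
        // 899810 is the hit count from the message above; every document matches here.
        NameCounter counter = run(899810, 1);
        System.out.println(counter.count + " names visited, last was " + counter.lastName);
    }
}
```

The point of the design is that the callback sees one document id at a time, so nothing about earlier hits has to be retained. Whether this removes the out-of-memory error in the real setup depends on how the document names are loaded, which this sketch does not model.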