From: "Jawahar Lal"
To: java-user@lucene.apache.org
Date: Mon, 25 Feb 2008 13:34:58 +0530
Subject: Out of Memory Exception
Message-ID: <967684dc0802250004r2f3869fey60229d2f302c7279@mail.gmail.com>

Hi,

I am using Lucene 2.0 to build a full-text index of PDF files. To extract the text from each PDF I am using the PDFBox library.

When I start indexing the PDF files I get an out-of-memory exception. This is because the files are about 10 MB in size.

I tried different values for mergefactor, maxmergefactor and maxbuffereddocs, i.e. 100, 100, 100; 10, 100, 100; 100, 100, 1000; etc.

I am storing the field value.

I have not been able to resolve this exception. Any suggestions for resolving the issue?

Thanks
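
As a rough sanity check on the settings mentioned above, the sketch below estimates how much document text could sit in memory before a flush. In Lucene 2.0, maxBufferedDocs sets how many documents are buffered in memory before they are written out as a new segment. The sketch assumes, purely for illustration, that each buffered document contributes about as many bytes as the 10 MB file size quoted above; the text actually extracted from a PDF may be much smaller, and the JVM adds its own overhead. The class name `BufferEstimate`, its methods, and the sample values 10, 100 and 1000 are hypothetical and not part of Lucene or PDFBox.

```java
// Back-of-the-envelope estimate only; nothing here calls Lucene or PDFBox.
// Assumption: each buffered document holds roughly bytesPerDoc bytes in memory.
final class BufferEstimate {

    // Upper bound on buffered bytes: documents held before a flush times size per document.
    // maxBufferedDocs mirrors the IndexWriter setting of the same name.
    static long estimateBufferedBytes(int maxBufferedDocs, long bytesPerDoc) {
        if (maxBufferedDocs < 0 || bytesPerDoc < 0) {
            throw new IllegalArgumentException("arguments must be non-negative");
        }
        // multiplyExact throws ArithmeticException instead of silently overflowing.
        return Math.multiplyExact((long) maxBufferedDocs, bytesPerDoc);
    }

    // Convert bytes to whole mebibytes (rounds down).
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // 10 MB per document: the file size from the post, used as a worst case.
        long bytesPerDoc = 10L * 1024 * 1024;
        // Illustrative maxBufferedDocs values, not recommendations.
        int[] candidates = {10, 100, 1000};
        for (int maxBufferedDocs : candidates) {
            long mib = toMiB(estimateBufferedBytes(maxBufferedDocs, bytesPerDoc));
            System.out.println("maxBufferedDocs=" + maxBufferedDocs
                    + " -> up to about " + mib + " MiB buffered");
        }
    }
}
```

Under these assumptions, any estimate that comes close to or exceeds the heap available to the JVM (the `-Xmx` setting) suggests that a smaller maxBufferedDocs value or a larger heap may be worth trying.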