From: "Karthik N S"
To: "Lucene Users List"
Subject: RE: pdfboxhelp
Date: Mon, 23 Aug 2004 11:04:42 +0530
In-Reply-To: <00a201c488d0$f2c972c0$4801a8c0@sprosys.com>

Hi Santosh,

Hold on, it's Monday and I am running behind schedule with my job. I will reply to you sometime around noon.

Karthik

-----Original Message-----
From: Santosh [mailto:santosh.s@softprosys.com]
Sent: Monday, August 23, 2004 10:51 AM
To: Lucene Users List
Subject: Fw: pdfboxhelp

Hi Karthik,

Did you find any solution? Should I send the PDF to you?

----- Original Message -----
From: "Santosh"
To: "Lucene Users List"
Sent: Monday, August 23, 2004 10:23 AM
Subject: Re: pdfboxhelp

> Hi Karthik,
>
> I put log4j in the classpath. Here is my CLASSPATH variable:
>
> .;..;C:\j2sdk1.4.1\lib;C:\j2sdk1.4.1\lib\jndi.jar;C:\j2sdk1.4.1\lib\webclient.jar;C:\j2sdk1.4.1\lib\mail.jar;C:\j2sdk1.4.1\lib\activation.jar;C:\j2sdk1.4.1\lib\xml-apis.jar;D:\JAVAPRO;C:\j2sdk1.4.1\jre\lib\ext\msbase.jar;C:\j2sdk1.4.1\lib\servlet.jar;E:\Program Files\Apache Tomcat 4.0\common\lib\servlet.jar;C:\Program Files\Altova\xmlspy\XMLSpyInterface.jar;C:\j2sdk1.4.1\lib\sax.jar;C:\j2sdk1.4.1\lib\dom.jar;C:\j2sdk1.4.1\lib\xalan.jar;C:\j2sdk1.4.1\lib\xercesImpl.jar;C:\j2sdk1.4.1\lib\xmlParserAPIs.jar;C:\j2sdk1.4.1\lib\parser.jar;C:\j2sdk1.4.1\lib\jaxp.jar;C:\j2sdk1.4.1\lib\xml.jar;C:\j2sdk1.4.1\lib\classes12.zip;C:\struts.jar;F:\apache-ant-1.6.1\lib\ant.jar;C:\j2sdk1.4.1\lib\PDFBox-0.6.6.jar;C:\j2sdk1.4.1\lib\lucene-20030909.jar;D:\setups\searchEngine\PDFBox-0.6.6\external\log4j.jar
>
> Please check the error.
>
> ----- Original Message -----
> From: "Karthik N S"
> To: "Lucene Users List"
> Sent: Monday, August 23, 2004 10:26 AM
> Subject: RE: pdfboxhelp
>
> > Hi Santosh,
> >
> > I think your PDF library is using the log4j package. Try setting the classpath to include the path to log4j.jar.
> >
> > Is it just a WARNING or an ERROR you are getting?
> >
> > Send me your configuration details and let me help you with it....
> > ;[
> >
> > Karthik
> >
> > -----Original Message-----
> > From: Santosh [mailto:santosh.s@softprosys.com]
> > Sent: Monday, August 23, 2004 10:11 AM
> > To: Lucene Users List
> > Cc: Ben Litchfield
> > Subject: Re: pdfboxhelp
> >
> > Hi Karthik,
> >
> > I have downloaded PDFBox and put the PDFBox jar file in the classpath, but when I type the following command at the command prompt I get this error:
> >
> > D:\setups\searchEngine\PDFBox-0.6.6\src>java org.pdfbox.ExtractText C:\test.pdf C:\test.txt
> > log4j:WARN No appenders could be found for logger (org.pdfbox.pdfparser.PDFParser).
> > log4j:WARN Please initialize the log4j system properly
> >
> > Why am I getting this error? Please help.
> >
> > ----- Original Message -----
> > From: "Karthik N S"
> > To: "Lucene Users List"
> > Sent: Monday, August 23, 2004 9:21 AM
> > Subject: RE: pdfboxhelp
> >
> > > Hi,
> > >
> > > To begin with, try to build the indexes offline (outside the Tomcat container). Once the indexes are complete, give your search the real path of the offline index folder, start Tomcat, and then use the search. As you experiment, you will become comfortable with the requirements of indexing and search...... ;[
> > >
> > > Karthik
> > >
> > > -----Original Message-----
> > > From: Santosh [mailto:santosh.s@softprosys.com]
> > > Sent: Saturday, August 21, 2004 4:55 PM
> > > To: Lucene Users List
> > > Subject: Re: pdfboxhelp
> > >
> > > Yes, I did the same.
> > > I copied all the classes into the classes folder, but now when I build the index using IndexHTML the PDFs are not added to this index; only text and HTML files are added.
> > > What changes should I make to IndexHTML.java to build the index with PDFs?
> > >
> > > ----- Original Message -----
> > > From: "Karthik N S"
> > > To: "Lucene Users List"
> > > Sent: Saturday, August 21, 2004 4:54 PM
> > > Subject: RE: pdfboxhelp
> > >
> > > > Hi,
> > > >
> > > > 1) If you are using the jar file with a web interface for JSP/servlet development, place the jar file in "webapps///lib" and also correct the classpath for this change.
> > > >
> > > > 2) Create your own package for all your Java files, and copy the Java files to /WEB-INF/classes/
> > > >
> > > > Then use the same......;{
> > > >
> > > > Karthik
> > > >
> > > > -----Original Message-----
> > > > From: Santosh [mailto:santosh.s@softprosys.com]
> > > > Sent: Saturday, August 21, 2004 4:31 PM
> > > > To: Lucene Users List
> > > > Subject: Re: pdfboxhelp
> > > >
> > > > Thanks Natarajan and Karthik,
> > > >
> > > > I corrected the classpath, but where should I write your code? Should I write it in IndexHTML.java, which comes with Lucene, or somewhere else?
> > > >
> > > > One more thing: I put the PDFBox jar file in the classpath. Is this enough, or do I have to build PDFBox?
> > > >
> > > > Thank you
> > > >
> > > > ----- Original Message -----
> > > > From: "Natarajan.T"
> > > > To: "'Lucene Users List'"
> > > > Sent: Saturday, August 21, 2004 3:20 PM
> > > > Subject: RE: pdfboxhelp
> > > >
> > > > > Hi Santosh,
> > > > >
> > > > > Try out the code below (the pdfbox.jar file must be in your classpath):
> > > > >
> > > > > public String getContent(InputStream reader) throws IOException {
> > > > >     PDDocument pdDoc = null;
> > > > >     String pdftext = "";
> > > > >     try {
> > > > >         PDFParser parser = new PDFParser(reader);
> > > > >         parser.parse();
> > > > >         pdDoc = parser.getPDDocument();
> > > > >         if (pdDoc.isEncrypted()) {
> > > > >             // Try to decrypt with an empty password.
> > > > >             DecryptDocument decryptor = new DecryptDocument(pdDoc);
> > > > >             decryptor.decryptDocument("");
> > > > >         }
> > > > >         PDFTextStripper stripper = new PDFTextStripper();
> > > > >         pdftext = stripper.getText(pdDoc);
> > > > >         // Document metadata; not used further in this method.
> > > > >         PDDocumentInformation info = pdDoc.getDocumentInformation();
> > > > >     } catch (Exception err) {
> > > > >         System.out.println(err.getMessage());
> > > > >     } finally {
> > > > >         // pdDoc is still null if parsing failed early, so guard the close.
> > > > >         if (pdDoc != null) {
> > > > >             pdDoc.close();
> > > > >         }
> > > > >     }
> > > > >     return pdftext;
> > > > > }
> > > > >
> > > > > Natarajan.
> > > > >
> > > > > -----Original Message-----
> > > > > From: Santosh [mailto:santosh.s@softprosys.com]
> > > > > Sent: Saturday, August 21, 2004 3:14 PM
> > > > > To: Lucene Users List
> > > > > Subject: Re: pdfboxhelp
> > > > >
> > > > > Hi Don,
> > > > >
> > > > > Your idea is nice, but whenever I write the following code in IndexHTML.java of Lucene
> > > > >
> > > > > import org.pdfbox.searchengine.lucene.*;
> > > > >
> > > > > File pdfFile = new File("/path/to/the/file.pdf");
> > > > >
> > > > > // Below returns a parsed PDF file in a Lucene Document object.
> > > > > Document doc = LucenePDFDocument.getDocument(pdfFile);
> > > > >
> > > > > I am getting the following error:
> > > > >
> > > > > package org.pdfbox.searchengine.lucene does not exist
> > > > >
> > > > > I have downloaded the PDFBox source code and put the jar file in the classpath. Please help me with this.
> > > > >
> > > > > ----- Original Message -----
> > > > > From: Don Vaillancourt
> > > > > To: Lucene Users List
> > > > > Sent: Friday, August 20, 2004 7:37 PM
> > > > > Subject: Re: pdfboxhelp
> > > > >
> > > > > Here is the super simple code required.
> > > > >
> > > > > import org.pdfbox.searchengine.lucene.*;
> > > > >
> > > > > File pdfFile = new File("/path/to/the/file.pdf");
> > > > >
> > > > > // Below returns a parsed PDF file in a Lucene Document object.
> > > > > Document doc = LucenePDFDocument.getDocument(pdfFile);
> > > > >
> > > > > Santosh wrote:
> > > > >
> > > > > Exactly, that is what I need.
> > > > >
> > > > > ----- Original Message -----
> > > > > From: Don Vaillancourt
> > > > > To: Lucene Users List
> > > > > Sent: Friday, August 20, 2004 6:39 PM
> > > > > Subject: Re: pdfboxhelp
> > > > >
> > > > > What are your intentions with PDFBox?
> > > > >
> > > > > Do you want to use it to index PDF files?
> > > > >
> > > > > Santosh wrote:
> > > > >
> > > > > Hi,
> > > > >
> > > > > I have downloaded the PDFBox zip, but I am unsure where to start. How can I try the demo? I don't see any help document with this download. Please help me.
> > > > >
> > > > > Regards,
> > > > > Santosh Kumar
> > > > > SoftPro Systems
> > > > > Hyderabad
> > > > >
> > > > > "The harder you train in peace, the lesser you bleed in war"
> > > > >
> > > > > -----------------------SOFTPRO DISCLAIMER------------------------------
> > > > >
> > > > > Information contained in this E-MAIL and any attachments are
> > > > > confidential being proprietary to SOFTPRO SYSTEMS is 'privileged'
> > > > > and 'confidential'.
> > > > >
> > > > > If you are not an intended or authorised recipient of this E-MAIL or
> > > > > have received it in error, You are notified that any use, copying or
> > > > > dissemination of the information contained in this E-MAIL in any
> > > > > manner whatsoever is strictly prohibited. Please delete it immediately
> > > > > and notify the sender by E-MAIL.
> > > > >
> > > > > In such a case reading, reproducing, printing or further dissemination
> > > > > of this E-MAIL is strictly prohibited and may be unlawful.
> > > > >
> > > > > SOFTPRO SYSYTEMS does not REPRESENT or WARRANT that an attachment
> > > > > hereto is free from computer viruses or other defects.
> > > > >
> > > > > The opinions expressed in this E-MAIL and any ATTACHEMENTS may be
> > > > > those of the author and are not necessarily those of SOFTPRO SYSTEMS.
> > > > > ------------------------------------------------------------------------
> > > > >
> > > > > --
> > > > > Don Vaillancourt
> > > > > Director of Software Development
> > > > >
> > > > > WEB IMPACT INC.
> > > > > phone: 416-815-2000 ext. 245
> > > > > fax: 416-815-2001
> > > > > email: donv@web-impact.com
> > > > > web: http://www.web-impact.com
> > > > >
> > > > > This email message is intended only for the addressee(s) and contains
> > > > > information that may be confidential and/or copyright. If you are not the
> > > > > intended recipient please notify the sender by reply email and immediately
> > > > > delete this email. Use, disclosure or reproduction of this email by anyone
> > > > > other than the intended recipient(s) is strictly prohibited. No
> > > > > representation is made that this email or any attachments are free of
> > > > > viruses. Virus scanning is recommended and is the responsibility of the
> > > > > recipient.
---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org
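
For the "package org.pdfbox.searchengine.lucene does not exist" error quoted in this thread, one way to see which classpath entry (if any) actually provides a class is to scan the entries directly. The following is only a minimal sketch using the JDK alone, written for a current Java version rather than the 1.4 SDK mentioned above. The class name ClasspathCheck and the method findProviders are made up for illustration and are not part of PDFBox or Lucene.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;
import java.util.zip.ZipFile;

// Hypothetical helper: reports which classpath entries contain a given class.
public class ClasspathCheck {

    /** Returns the classpath entries (jars, zips or directories) that contain className. */
    public static List<String> findProviders(String classpath, String className) {
        // org.pdfbox.Foo -> org/pdfbox/Foo.class (archive entries always use '/')
        String resource = className.replace('.', '/') + ".class";
        List<String> providers = new ArrayList<>();
        for (String entry : classpath.split(Pattern.quote(File.pathSeparator))) {
            if (entry.isEmpty()) {
                continue;
            }
            File f = new File(entry);
            if (f.isDirectory()) {
                // Directory entry: look for the .class file under it.
                if (new File(f, resource).isFile()) {
                    providers.add(entry);
                }
            } else if (f.isFile()) {
                // Archive entry: look inside the jar or zip.
                try (ZipFile zip = new ZipFile(f)) {
                    if (zip.getEntry(resource) != null) {
                        providers.add(entry);
                    }
                } catch (IOException e) {
                    System.err.println("unreadable archive: " + entry);
                }
            } else {
                // A typo in the path shows up here.
                System.err.println("missing classpath entry: " + entry);
            }
        }
        return providers;
    }

    public static void main(String[] args) {
        String cp = System.getProperty("java.class.path");
        String cls = args.length > 0
                ? args[0]
                : "org.pdfbox.searchengine.lucene.LucenePDFDocument";
        List<String> found = findProviders(cp, cls);
        System.out.println(found.isEmpty()
                ? cls + " not found on classpath"
                : cls + " found in " + found);
    }
}
```

Run it with the same classpath that is used for compiling, passing the class name as the argument. An empty result would suggest that the jar on the classpath does not contain that package, or that the path to the jar is wrong. The two log4j:WARN lines are a separate matter: log4j prints them when it has been loaded but not configured, so they do not by themselves indicate a classpath problem.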