# lucene-java-user mailing list archives

From "Santosh" <santos...@softprosys.com>
Subject Re: pdfboxhelp
Date Mon, 23 Aug 2004 04:53:06 GMT
```
hi karthik,
I kept log4j in the classpath. I am sending my CLASSPATH variable:

CLASSPATH

.;..;C:\j2sdk1.4.1\lib;C:\j2sdk1.4.1\lib\jndi.jar;C:\j2sdk1.4.1\lib\webclient.jar;C:\j2sdk1.4.1\lib\mail.jar;C:\j2sdk1.4.1\lib\activation.jar;C:\j2sdk1.4.1\lib\xml-apis.jar;D:\JAVAPRO;C:\j2sdk1.4.1\jre\lib\ext\msbase.jar;C:\j2sdk1.4.1\lib\servlet.jar;E:\Program Files\Apache Tomcat 4.0\common\lib\servlet.jar;C:\Program Files\Altova\xmlspy\XMLSpyInterface.jar;C:\j2sdk1.4.1\lib\sax.jar;C:\j2sdk1.4.1\lib\dom.jar;C:\j2sdk1.4.1\lib\xalan.jar;C:\j2sdk1.4.1\lib\xercesImpl.jar;C:\j2sdk1.4.1\lib\xmlParserAPIs.jar;C:\j2sdk1.4.1\lib\parser.jar;C:\j2sdk1.4.1\lib\jaxp.jar;C:\j2sdk1.4.1\lib\xml.jar;C:\j2sdk1.4.1\lib\classes12.zip;C:\struts.jar;F:\apache-ant-1.6.1\lib\ant.jar;C:\j2sdk1.4.1\lib\PDFBox-0.6.6.jar;C:\j2sdk1.4.1\lib\lucene-20030909.jar;D:\setups\searchEngine\PDFBox-0.6.6\external\log4j.jar

----- Original Message -----
From: "Karthik N S" <karthik@controlnet.co.in>
To: "Lucene Users List" <lucene-user@jakarta.apache.org>
Sent: Monday, August 23, 2004 10:26 AM
Subject: RE: pdfboxhelp

> Hi Santosh,
>
>   I think your PDF library uses the Log4j package. Try adding the
> log4j.jar path to the classpath.
>
>   Is it just a WARNING or an ERROR you are getting?
>
>   Send me your configuration and let me help you with it.
>
>
> Karthik
>
> -----Original Message-----
> From: Santosh [mailto:santosh.s@softprosys.com]
> Sent: Monday, August 23, 2004 10:11 AM
> To: Lucene Users List
> Cc: Ben Litchfield
> Subject: Re: pdfboxhelp
>
>
> hi karthik,
>
> I have downloaded PDFBox and kept the PDFBox jar file in the classpath, but
> when I type the following command at the command prompt I get this error:
>
> D:\setups\searchEngine\PDFBox-0.6.6\src>java org.pdfbox.ExtractText C:\test.pdf C:\test.txt
> log4j:WARN No appenders could be found for logger (org.pdfbox.pdfparser.PDFParser).
> log4j:WARN Please initialize the log4j system properly
>
> Why am I getting this error? Please help.
>
>
> ----- Original Message -----
> From: "Karthik N S" <karthik@controlnet.co.in>
> To: "Lucene Users List" <lucene-user@jakarta.apache.org>
> Sent: Monday, August 23, 2004 9:21 AM
> Subject: RE: pdfboxhelp
>
>
> > Hi
> >
> >
> >     To begin with, try to build the indexes offline [outside the Tomcat
> > container]. Once indexing completes, give your search the real path of
> > the offline index folder, start Tomcat, and then use the search. As you
> > experiment, you will get comfortable with the requirements of indexing
> > and search.
> >
> > Karthik
> >
> > -----Original Message-----
> > From: Santosh [mailto:santosh.s@softprosys.com]
> > Sent: Saturday, August 21, 2004 4:55 PM
> > To: Lucene Users List
> > Subject: Re: pdfboxhelp
> >
> >
> > Yes, I did the same.
> > I copied all the classes into the classes folder, but now when I build
> > the index using IndexHTML, the PDFs are not added to the index; only
> > text and HTML files are added.
> > What changes should I make to IndexHTML.java to build the index with PDFs?
> > ----- Original Message -----
> > From: "Karthik N S" <karthik@controlnet.co.in>
> > To: "Lucene Users List" <lucene-user@jakarta.apache.org>
> > Sent: Saturday, August 21, 2004 4:54 PM
> > Subject: RE: pdfboxhelp
> >
> >
> > > Hi
> > >
> > > 1) If you are using the jar file with a web interface for JSP/servlet
> > > development, place the jar file in
> > > "webapps/<your application>/WEB-INF/lib" and also correct the classpath
> > > for this change.
> > >
> > > 2) Create your own package for your Java files and copy them to
> > > /WEB-INF/classes/<your package>
> > >
> > >
> > > Then use the same.
> > >
> > >
> > > Karthik
> > >
> > > -----Original Message-----
> > > From: Santosh [mailto:santosh.s@softprosys.com]
> > > Sent: Saturday, August 21, 2004 4:31 PM
> > > To: Lucene Users List
> > > Subject: Re: pdfboxhelp
> > >
> > >
> > > Thanks, Natarajan and Karthik,
> > >
> > > I corrected the classpath,
> > >
> > > but where should I write your code?
> > > Should I write it in IndexHTML.java, which comes along with Lucene, or
> > > some other place?
> > > One more thing:
> > > I kept the PDFBox jar file in the classpath. Is this enough, or do I
> > > have to build PDFBox?
> > >
> > > Thank you
> > > ----- Original Message -----
> > > From: "Natarajan.T" <natarajant@crimsonlogic.co.in>
> > > To: "'Lucene Users List'" <lucene-user@jakarta.apache.org>
> > > Sent: Saturday, August 21, 2004 3:20 PM
> > > Subject: RE: pdfboxhelp
> > >
> > >
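> > > > Hi Santosh,
> > > >
> > > > Try out the code below (the pdfbox.jar file must be in your classpath):
> > > >
> > > > // Extracts the text of a PDF read from the given stream.
> > > > public String getContent(InputStream reader) throws IOException {
> > > >     PDDocument pdDoc = null;
> > > >     String pdftext = "";
> > > >     try {
> > > >         PDFParser parser = new PDFParser(reader);
> > > >         parser.parse();
> > > >         pdDoc = parser.getPDDocument();
> > > >         // Encrypted documents are decrypted with an empty password.
> > > >         if (pdDoc.isEncrypted()) {
> > > >             DecryptDocument decryptor = new DecryptDocument(pdDoc);
> > > >             decryptor.decryptDocument("");
> > > >         }
> > > >         PDFTextStripper stripper = new PDFTextStripper();
> > > >         pdftext = stripper.getText(pdDoc);
> > > >         // 'info' is assumed to be a field of the enclosing class.
> > > >         info = pdDoc.getDocumentInformation();
> > > >     } catch (Exception err) {
> > > >         System.out.println(err.getMessage());
> > > >     } finally {
> > > >         // Close only if parsing got far enough to create the document.
> > > >         if (pdDoc != null) {
> > > >             pdDoc.close();
> > > >         }
> > > >     }
> > > >     return pdftext;
> > > > }
> > > >
> > > > Natarajan.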
> > > >
> > > > -----Original Message-----
> > > > From: Santosh [mailto:santosh.s@softprosys.com]
> > > > Sent: Saturday, August 21, 2004 3:14 PM
> > > > To: Lucene Users List
> > > > Subject: Re: pdfboxhelp
> > > >
> > > > Hi Don,
> > > >
> > > > Your idea is nice, but whenever I write the following code in
> > > > IndexHTML.java of Lucene
> > > >
> > > >
> > > > import org.pdfbox.searchengine.lucene.*;
> > > >
> > > > File pdfFile = new File("/path/to/the/file.pdf");
> > > >
> > > > // Below returns a parsed PDF file in a Lucene Document object.
> > > > Document doc = LucenePDFDocument.getDocument(pdfFile);
> > > >
> > > > I am getting the following error:
> > > >
> > > > package org.pdfbox.searchengine.lucene does not exist
> > > >
> > > > I have downloaded pdfbox source code and kept the jar file in the
> > > >
> > > > ----- Original Message -----
> > > > From: Don Vaillancourt
> > > > To: Lucene Users List
> > > > Sent: Friday, August 20, 2004 7:37 PM
> > > > Subject: Re: pdfboxhelp
> > > >
> > > >
> > > >   Here is the super simple code required.
> > > >
> > > >   import org.pdfbox.searchengine.lucene.*;
> > > >
> > > >   File pdfFile = new File("/path/to/the/file.pdf");
> > > >
> > > >   // Below returns a parsed PDF file in a Lucene Document object.
> > > >   Document doc = LucenePDFDocument.getDocument(pdfFile);
> > > >
> > > >   Santosh wrote:
> > > >
> > > > Exactly, that is what I need.
> > > >
> > > > ----- Original Message -----
> > > > From: Don Vaillancourt
> > > > To: Lucene Users List
> > > > Sent: Friday, August 20, 2004 6:39 PM
> > > > Subject: Re: pdfboxhelp
> > > >
> > > >
> > > >   What are your intentions with PDFBox?
> > > >
> > > >   You want to use it to index PDF files?
> > > >
> > > >   Santosh wrote:
> > > >
> > > > hi,
> > > >
> > > > I have downloaded the PDFBox zip, but I am unsure where to start.
> > > > How can I try out a demo? I don't see any help document with it.
> > > >
> > > >
> > > > regards
> > > > Santosh kumar
> > > > SoftPro Systems
> > > >
> > > >
> > > > "The harder you train in peace, the lesser you bleed in war"
> > > >
> > > > -----------------------SOFTPRO
> DISCLAIMER------------------------------
> > > >
> > > > Information contained in this E-MAIL and any attachments are
> > > > confidential being  proprietary to SOFTPRO SYSTEMS  is 'privileged'
> > > > and 'confidential'.
> > > >
> > > > If you are not an intended or authorised recipient of this E-MAIL or
> > > > have received it in error, You are notified that any use, copying or
> > > > dissemination  of the information contained in this E-MAIL in any
> > > > manner whatsoever is strictly prohibited. Please delete it
immediately
> > > > and notify the sender by E-MAIL.
> > > >
> > > > In such a case reading, reproducing, printing or further
dissemination
> > > > of this E-MAIL is strictly prohibited and may be unlawful.
> > > >
> > > > SOFTPRO SYSYTEMS does not REPRESENT or WARRANT that an attachment
> > > > hereto is free from computer viruses or other defects.
> > > >
> > > > The opinions expressed in this E-MAIL and any ATTACHEMENTS may be
> > > > those of the author and are not necessarily those of SOFTPRO
SYSTEMS.
> > >
> > ------------------------------------------------------------------------
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >   -- Don Vaillancourt
> > > >   Director of Software Development
> > > >
> > > >   WEB IMPACT INC.
> > > >   phone: 416-815-2000 ext. 245
> > > >   fax: 416-815-2001
> > > >   email: donv@web-impact.com
> > > >   web: http://www.web-impact.com
> > > >
> > > >
> > > >
> > > >   This email message is intended only for the addressee(s) and contains
> > > > information that may be confidential and/or copyright. If you are not
> > > > the intended recipient please notify the sender by reply email and
> > > > immediately delete this email. Use, disclosure or reproduction of this
> > > > email by anyone other than the intended recipient(s) is strictly
> > > > prohibited. No representation is made that this email or any
> > > > attachments are free of viruses. Virus scanning is recommended and is
> > > > the responsibility of the recipient.
> > > >
> > > > ---------------------------------------------------------------------
> > > > To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> > > > For additional commands, e-mail: lucene-user-help@jakarta.apache.org

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org