lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Karthik N S" <kart...@controlnet.co.in>
Subject RE: pdfboxhelp
Date Mon, 23 Aug 2004 03:51:40 GMT
Hi


    To Begin with try to build Indexes offline  [ out of Tomcat container]
and  on completing indxexes, feed u'r search  with the real
    path of the  offline indexed folder,Start the Tomcat and then use the
search on.... As u experiment it out u will be comfortable with
    requirment of Indexing /Search......       ; [

Karthik

-----Original Message-----
From: Santosh [mailto:santosh.s@softprosys.com]
Sent: Saturday, August 21, 2004 4:55 PM
To: Lucene Users List
Subject: Re: pdfboxhelp


Yes I did the same.
I copied all the classes into classes folder but
now when I am building the index using IndexHTML the pdfs are not added to
this index, only text and htmls are added to index.
what changes should I do for IndexHTML.java to build index with pdf
----- Original Message -----
From: "Karthik N S" <karthik@controlnet.co.in>
To: "Lucene Users List" <lucene-user@jakarta.apache.org>
Sent: Saturday, August 21, 2004 4:54 PM
Subject: RE: pdfboxhelp


> Hi
>
> If u are using the jar file with Web Interface for jsp/servlet dev, Place
> the jar file in  "webapps/<u'rapplication>/<Web-inf>/lib"
> and also correct the Classpath for the present modification.
>
> 2)create u'r own package and put all u'r java files  copy the java files
to
> /Web-inf/Classes/<u'r package>
>
>
> Then use the same......;{
>
>
> Karthik
>
> -----Original Message-----
> From: Santosh [mailto:santosh.s@softprosys.com]
> Sent: Saturday, August 21, 2004 4:31 PM
> To: Lucene Users List
> Subject: Re: pdfboxhelp
>
>
> thanks  Natarajan and karthik,
>
> I corrected classpath
>
> but where I should write your code?
> should I write your code in IndexHTML.java  which comes along with lucene
or
> some other place?
> one more thing
> I kept pdfbox jar file in the classpath is this enough or I have to build
> the pdfbox?
>
> thankyou
> ----- Original Message -----
> From: "Natarajan.T" <natarajant@crimsonlogic.co.in>
> To: "'Lucene Users List'" <lucene-user@jakarta.apache.org>
> Sent: Saturday, August 21, 2004 3:20 PM
> Subject: RE: pdfboxhelp
>
>
> > Hi Santhosh,
> >
> > Try out this below code.....(pdfbox.jar file must be in your classpath)
> >
> > public String getContent(InputStream  reader) throws
IOException{PDFParser
> parser = null;PDDocument pdDoc = null;PDFTextStripper stripper =
null;String
> pdftext = "";try{parser = new PDFParser(reader);parser.parse();pdDoc =
> parser.getPDDocument();if(pdDoc.isEncrypted()){DecryptDocument decryptor =
> new
> > DecryptDocument(pdDoc);decryptor.decryptDocument("");}stripper = new
> PDFTextStripper();pdftext = stripper.getText(pdDoc);
> >
> >        info = pdDoc.getDocumentInformation();}catch(Exception err)
> {System.out.println(err.getMessage());}pdDoc.close();return pdftext;}
> >
> > Natarajan.
> >
> > -----Original Message-----
> > From: Santosh [mailto:santosh.s@softprosys.com]
> > Sent: Saturday, August 21, 2004 3:14 PM
> > To: Lucene Users List
> > Subject: Re: pdfboxhelp
> >
> > Hi Don,
> >
> > your Idea is nice, but whenever I write the  following code in
> > IndexHTML.java of lucene
> >
> >
> > import org.pdfbox.searchengine.lucene.*;
> >
> > File pdfFile = new File("/path/to/the/file.pdf");
> >
> > // Below returns a parse PDF file in a Lucene Document object.
> > Document doc = LucenePDFDocument.getDocument(pdfFile);
> >
> > Iam getting the following error
> >
> > package org.pdfbox.searchengine.lucene does not exist
> >
> > I have downloaded pdfbox source code and kept the jar file in the
> > classpath, please help me on this----- Original Message ----- From: Don
> Vaillancourt To: Lucene Users List Sent: Friday, August 20, 2004 7:37
> PMSubject: Re: pdfboxhelp
> >
> >
> >   Here is the super simple code required.
> >
> >   import org.pdfbox.searchengine.lucene.*;
> >
> >   File pdfFile = new File("/path/to/the/file.pdf");
> >
> >   // Below returns a parse PDF file in a Lucene Document object.Document
> doc = LucenePDFDocument.getDocument(pdfFile);
> >
> >                   Santosh wrote:
> >
> > exactly, the same is required to me----- Original Message ----- From:
Don
> Vaillancourt To: Lucene Users List Sent: Friday, August 20, 2004 6:39
> PMSubject: Re: pdfboxhelp
> >
> >
> >   What are your intensions with PDFBox?
> >
> >   You want to use it to index PDF files?
> >
> >   Santosh wrote:
> >
> > hi,
> >
> > I have downloaded pdfbox zip. but i am in ambigous state that where to
> > start. how can I check with demo, I dont see any help document with this
> > download, please help me.
> >
> >
> > regards
> > Santosh kumar
> > SoftPro Systems
> > Hyderabad
> >
> >
> > "The harder you train in peace, the lesser you bleed in war"
> >
> > -----------------------SOFTPRO DISCLAIMER------------------------------
> >
> > Information contained in this E-MAIL and any attachments are
> > confidential being  proprietary to SOFTPRO SYSTEMS  is 'privileged'
> > and 'confidential'.
> >
> > If you are not an intended or authorised recipient of this E-MAIL or
> > have received it in error, You are notified that any use, copying or
> > dissemination  of the information contained in this E-MAIL in any
> > manner whatsoever is strictly prohibited. Please delete it immediately
> > and notify the sender by E-MAIL.
> >
> > In such a case reading, reproducing, printing or further dissemination
> > of this E-MAIL is strictly prohibited and may be unlawful.
> >
> > SOFTPRO SYSYTEMS does not REPRESENT or WARRANT that an attachment
> > hereto is free from computer viruses or other defects.
> >
> > The opinions expressed in this E-MAIL and any ATTACHEMENTS may be
> > those of the author and are not necessarily those of SOFTPRO SYSTEMS.
> > ------------------------------------------------------------------------
> >
> >
> >
> >
> >
> >   -- Don VaillancourtDirector of Software Development
> >
> >   WEB IMPACT INC.phone: 416-815-2000 ext. 245fax: 416-815-2001email:
> donv@web-impact.comweb: http://www.web-impact.com
> >
> >
> >
> >   This email message is intended only for the addressee(s)and contains
> information that may be confidential and/orcopyright. If you are not the
> intended recipient pleasenotify the sender by reply email and immediately
> deletethis email. Use, disclosure or reproduction of this emailby anyone
> other than the intended recipient(s) is strictlyprohibited. No
> representation is made that this email orany attachments are free of
> viruses. Virus scanning isrecommended and is the responsibility of the
> recipient.
> >
> >
> >
> > -----------------------SOFTPRO DISCLAIMER------------------------------
> >
> > Information contained in this E-MAIL and any attachments are
> > confidential being  proprietary to SOFTPRO SYSTEMS  is 'privileged'
> > and 'confidential'.
> >
> > If you are not an intended or authorised recipient of this E-MAIL or
> > have received it in error, You are notified that any use, copying or
> > dissemination  of the information contained in this E-MAIL in any
> > manner whatsoever is strictly prohibited. Please delete it immediately
> > and notify the sender by E-MAIL.
> >
> > In such a case reading, reproducing, printing or further dissemination
> > of this E-MAIL is strictly prohibited and may be unlawful.
> >
> > SOFTPRO SYSYTEMS does not REPRESENT or WARRANT that an attachment
> > hereto is free from computer viruses or other defects.
> >
> > The opinions expressed in this E-MAIL and any ATTACHEMENTS may be
> > those of the author and are not necessarily those of SOFTPRO SYSTEMS.
> > ------------------------------------------------------------------------
> >
> >
> >
> >
> >
> > ------------------------------------------------------------------------
> > ------
> >
> >
>---------------------------------------------------------------------To
> unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.orgFor
> additional commands, e-mail: lucene-user-help@jakarta.apache.org
> >
> > -----------------------SOFTPRO DISCLAIMER------------------------------
> >
> > Information contained in this E-MAIL and any attachments are
> > confidential being  proprietary to SOFTPRO SYSTEMS  is 'privileged'
> > and 'confidential'.
> >
> > If you are not an intended or authorised recipient of this E-MAIL or
> > have received it in error, You are notified that any use, copying or
> > dissemination  of the information contained in this E-MAIL in any
> > manner whatsoever is strictly prohibited. Please delete it immediately
> > and notify the sender by E-MAIL.
> >
> > In such a case reading, reproducing, printing or further dissemination
> > of this E-MAIL is strictly prohibited and may be unlawful.
> >
> > SOFTPRO SYSYTEMS does not REPRESENT or WARRANT that an attachment
> > hereto is free from computer viruses or other defects.
> >
> > The opinions expressed in this E-MAIL and any ATTACHEMENTS may be
> > those of the author and are not necessarily those of SOFTPRO SYSTEMS.
> > ------------------------------------------------------------------------
> >
> >
> >
> >
> >
> >   -- Don VaillancourtDirector of Software Development
> >
> >   WEB IMPACT INC.phone: 416-815-2000 ext. 245fax: 416-815-2001email:
> donv@web-impact.comweb: http://www.web-impact.com
> >
> >
> >
> >   This email message is intended only for the addressee(s)and contains
> information that may be confidential and/orcopyright. If you are not the
> intended recipient pleasenotify the sender by reply email and immediately
> deletethis email. Use, disclosure or reproduction of this emailby anyone
> other than the intended recipient(s) is strictlyprohibited. No
> representation is made that this email orany attachments are free of
> viruses. Virus scanning isrecommended and is the responsibility of the
> recipient.
> >
> >
> >
> > -----------------------SOFTPRO DISCLAIMER------------------------------
> >
> > Information contained in this E-MAIL and any attachments are
> > confidential being  proprietary to SOFTPRO SYSTEMS  is 'privileged'
> > and 'confidential'.
> >
> > If you are not an intended or authorised recipient of this E-MAIL or
> > have received it in error, You are notified that any use, copying or
> > dissemination  of the information contained in this E-MAIL in any
> > manner whatsoever is strictly prohibited. Please delete it immediately
> > and notify the sender by E-MAIL.
> >
> > In such a case reading, reproducing, printing or further dissemination
> > of this E-MAIL is strictly prohibited and may be unlawful.
> >
> > SOFTPRO SYSYTEMS does not REPRESENT or WARRANT that an attachment
> > hereto is free from computer viruses or other defects.
> >
> > The opinions expressed in this E-MAIL and any ATTACHEMENTS may be
> > those of the author and are not necessarily those of SOFTPRO SYSTEMS.
> > ------------------------------------------------------------------------
> >
> >
> >
> >
> >
> > ------------------------------------------------------------------------
> > ------
> >
> >
>---------------------------------------------------------------------To
> unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.orgFor
> additional commands, e-mail: lucene-user-help@jakarta.apache.org
> >
> > -----------------------SOFTPRO DISCLAIMER------------------------------
> >
> > Information contained in this E-MAIL and any attachments are
> > confidential being  proprietary to SOFTPRO SYSTEMS  is 'privileged'
> > and 'confidential'.
> >
> > If you are not an intended or authorised recipient of this E-MAIL or
> > have received it in error, You are notified that any use, copying or
> > dissemination  of the information contained in this E-MAIL in any
> > manner whatsoever is strictly prohibited. Please delete it immediately
> > and notify the sender by E-MAIL.
> >
> > In such a case reading, reproducing, printing or further dissemination
> > of this E-MAIL is strictly prohibited and may be unlawful.
> >
> > SOFTPRO SYSYTEMS does not REPRESENT or WARRANT that an attachment
> > hereto is free from computer viruses or other defects.
> >
> > The opinions expressed in this E-MAIL and any ATTACHEMENTS may be
> > those of the author and are not necessarily those of SOFTPRO SYSTEMS.
> > ------------------------------------------------------------------------
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> > For additional commands, e-mail: lucene-user-help@jakarta.apache.org
> >
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
>


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message