lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ben Litchfield <...@csh.rit.edu>
Subject Re: PDFBox problem.
Date Fri, 23 Jul 2004 15:14:53 GMT


I usually use use -Dlog4j.configuration=log4j.xml when invoking java from
the command line, but I believe this depends on your environment.

ex

java -Dlog4j.configuration=log4j.xml org.pdfbox.ExtractText input.pdf

Ben



On Fri, 23 Jul 2004, Christiaan Fluit wrote:

> We invoke the following code in a static initializer that simply
> disables log4j's output entirely.
>
> 	static {
> 		Properties props = new Properties();
> 		props.put("log4j.threshold", "OFF");
> 		org.apache.log4j.PropertyConfigurator.configure(props);
> 	}
>
> Of course, when you make use of log4j in your own code, you have to be
> more specific.
>
>
> Regards,
>
> Chris.
> --
>
> Natarajan.T wrote:
>
> > FYI,
> >
> > I am using PDFBox.jar  to Convert PDF to Text.
> >
> > Problem is in the runtime its printing lot of object messages
> >
> > How can I avoid this one??? How can I go with this one.
> >
> > import java.io.InputStream;
> > import java.io.BufferedWriter;
> > import java.io.IOException;
> >
> > import org.pdfbox.util.PDFTextStripper;
> > import org.pdfbox.pdfparser.PDFParser;
> > import org.pdfbox.pdmodel.PDDocument;
> > import org.pdfbox.pdmodel.PDDocumentInformation;
> >
> >
> > /**
> >  * @author natarajant
> >  *
> >  * TODO To change the template for this generated type comment go to
> >  * Window - Preferences - Java - Code Generation - Code and Comments  */
> > public class PDFConverter extends DocumentConverter{
> >
> >       public PDFConverter() {
> >       }
> >
> >        /**
> >         * This method will construct the Lucene document object from the
> >         * given information by extracting the text from PDF file.
> >         *
> >         * @param              reader and writer - InputStream
> > and BufferedWriter
> >         * @return             true or false i.e. extract the
> > text or not
> >         */
> >         public boolean extractText(InputStream  reader, BufferedWriter
> > writer) throws IOException{
> >
> >              PDFParser parser = null;
> >              PDDocument pdDoc = null;
> >              PDFTextStripper stripper = null;
> >              String pdftext = "";
> >              String pdftitle = "";
> >              try {
> >              parser = new PDFParser(reader);
> >                    parser.parse();
> >                    pdDoc = parser.getPDDocument();
> >
> >                    stripper = new PDFTextStripper();
> >                    pdftext = stripper.getText(pdDoc);
> >
> >                    writer.write(pdftext +" ");
> >
> >              PDDocumentInformation info =
> > pdDoc.getDocumentInformation();
> >                    pdftitle = info.getTitle();
> >
> >        } catch(Exception err) {
> >
> >                    System.out.println(err.getMessage());
> >             }
> >             writer.close();
> >             return true;
> >        }
> >
> >
> > }
> >
> >
>
>
> --
> christiaan.fluit@aduna.biz
>
> Aduna
> Prinses Julianaplein 14-b
> 3817 CS Amersfoort
> The Netherlands
>
> +31 33 465 9987 phone
> +31 33 465 9987 fax
>
> http://aduna.biz
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
>

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message