lucene-java-user mailing list archives

From Christiaan Fluit <Christiaan.Fl...@aduna.biz>
Subject Re: PDFBox problem.
Date Fri, 23 Jul 2004 15:08:29 GMT
We invoke the following code in a static initializer that simply 
disables log4j's output entirely.

	static {
		Properties props = new Properties();
		props.put("log4j.threshold", "OFF");
		org.apache.log4j.PropertyConfigurator.configure(props);
	}

Of course, if you use log4j in your own code, you will need a more 
targeted configuration than switching off all output.
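
As a rough sketch of what "more specific" could look like: log4j 1.x 
accepts per-logger levels through keys of the form 
log4j.logger.<name>=<level>, so only one logger hierarchy is silenced. 
The class name QuietPdfBoxLogging and the logger prefix "org.pdfbox" are 
assumptions made for illustration (the prefix assumes PDFBox names its 
loggers after its classes). In practice these keys would be merged into 
your existing log4j configuration rather than used on their own.

```java
import java.util.Properties;

public class QuietPdfBoxLogging {

    // Assumption: PDFBox's loggers live under this package name.
    static final String PDFBOX_LOGGER_PREFIX = "org.pdfbox";

    /** Builds log4j 1.x properties that switch off only the PDFBox loggers. */
    static Properties quietPdfBoxProperties() {
        Properties props = new Properties();
        // log4j.logger.<name>=<level> sets the level for one logger hierarchy;
        // loggers outside that hierarchy keep their configured levels.
        props.setProperty("log4j.logger." + PDFBOX_LOGGER_PREFIX, "OFF");
        return props;
    }

    public static void main(String[] args) {
        Properties props = quietPdfBoxProperties();
        // With log4j 1.x on the classpath, the properties would be applied with:
        // org.apache.log4j.PropertyConfigurator.configure(props);
        System.out.println(props);
    }
}
```

The log4j call is left as a comment so the sketch compiles without the 
log4j jar; uncomment it in a project that already depends on log4j.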


Regards,

Chris.
--

Natarajan.T wrote:

> FYI,
>  
> I am using PDFBox.jar to convert PDF to text.
>  
> The problem is that at runtime it prints a lot of object messages.
>  
> How can I avoid this?
>  
> import java.io.InputStream;
> import java.io.BufferedWriter;
> import java.io.IOException;
>  
> import org.pdfbox.util.PDFTextStripper;
> import org.pdfbox.pdfparser.PDFParser;
> import org.pdfbox.pdmodel.PDDocument;
> import org.pdfbox.pdmodel.PDDocumentInformation;
>  
>  
> /**
>  * @author natarajant
>  *
>  * TODO To change the template for this generated type comment go to
>  * Window - Preferences - Java - Code Generation - Code and Comments  */
> public class PDFConverter extends DocumentConverter{
>  
>       public PDFConverter() {
>       }
>  
>        /**
>         * Extracts the text from a PDF file so that the Lucene document
>         * object can be constructed from it.
>         *
>         * @param reader  InputStream supplying the PDF data
>         * @param writer  BufferedWriter that receives the extracted text
>         * @return        true if the text was extracted, false otherwise
>         */
>        public boolean extractText(InputStream reader, BufferedWriter writer)
>                throws IOException {
>  
>             PDFParser parser = null;
>             PDDocument pdDoc = null;
>             PDFTextStripper stripper = null;
>             String pdftext = "";
>             String pdftitle = "";
>             try {
>                   parser = new PDFParser(reader);
>                   parser.parse();
>                   pdDoc = parser.getPDDocument();
>  
>                   stripper = new PDFTextStripper();
>                   pdftext = stripper.getText(pdDoc);
>  
>                   writer.write(pdftext + " ");
>  
>                   PDDocumentInformation info = pdDoc.getDocumentInformation();
>                   pdftitle = info.getTitle(); // read but not used yet
>                   return true;
>             } catch (Exception err) {
>                   // Report the failure instead of claiming success.
>                   System.out.println(err.getMessage());
>                   return false;
>             } finally {
>                   // Close the writer whether or not extraction succeeded.
>                   writer.close();
>             }
>        }
>  
>  
> }
>  
> 


-- 
christiaan.fluit@aduna.biz

Aduna
Prinses Julianaplein 14-b
3817 CS Amersfoort
The Netherlands

+31 33 465 9987 phone
+31 33 465 9987 fax

http://aduna.biz

