lucene-java-user mailing list archives

From Suhas Indra <>
Subject PDF Text extraction
Date Fri, 27 Dec 2002 06:34:11 GMT
Hello List

I am using PDFBox to index some PDF documents. The parser works fine
and I can read the summary, but the contents are displayed as

When I try the following:
System.out.println(doc.getField("contents")) (where doc is the Document)

The result is:


I want to print the extracted data.

Can anyone please let me know how to extract the contents?



Robosoft Technologies - Partners in Product Development
