pdfbox-users mailing list archives

From "Romina O. Leon" <rominaleo...@gmail.com>
Subject Migration to PDFBox 2.0.0
Date Tue, 12 Jan 2016 17:35:42 GMT
Hi! Great library! Thank you so much :)

I'm migrating my application from PDFBox 1.8.10 to 2.0.0, and I'm trying to
get the text (String) of a page. Your website says:

Parsing the Page Content

Getting the content for a page has been simplified.

Prior to PDFBox 2.0 parsing the page content was done using

PDStream contents = page.getContents();
PDFStreamParser parser = new PDFStreamParser(contents.getStream());
parser.parse();
List<Object> tokens = parser.getTokens();

But the getContents() method of the PDPage class returns an InputStream,
which can't be cast to a PDStream.

And with the example below:

With PDFBox 2.0 the code is reduced to

PDFStreamParser parser = new PDFStreamParser(page);
parser.parse();
List<Object> tokens = parser.getTokens();

I still can't get the page content!

I would appreciate your help!
Thanks!

-- 
Romina Alejandra Osorio León
Teléfono:     (0412) 0905791
E-mail:         rominaleon.7@gmail.com
<https://ve.linkedin.com/in/rominaoleon>
