pdfbox-users mailing list archives

From Systema Sephiroticum <fallen.tab...@gmail.com>
Subject Casting confusion
Date Thu, 05 Jan 2017 19:57:10 GMT
I've been tasked with traversing PDF files for embedded links, i.e.
anchors added in winword and what have you. Due to different ways URLs can
be embedded in PDFs, and the variety of nesting levels and ordering, it's
proving a challenge.

I can see in my IDE that the URL I need to acquire is nested deep in the
COSObject's baseObject property, but baseObject is private, so I can't
toString() it. To make matters worse, toString() on the COSObject itself
returns only the top-level object rather than the entire contents. If
there were a way to get the stringified contents of baseObject, I would be
done with this by now.

My example method is a bit long for email, so I put it on pastebin. It
compiles, but is borderline pseudocode, so there's no need to look at it
unless the rest of this email is unclear.
http://pastebin.com/LvXu0tNh

I'm curious what to do after getting an object from an array as such:
COSObject obj = (COSObject) cosArr.get(i);

At this point I can see the obj's baseObject, like I said, but to acquire
it I have to attempt all sorts of casts to COSDictionary, catch the
ClassCastExceptions and try again, with null checks on retrieved objects at
every step. Am I doing this wrong? I haven't been able to find a pattern
for this that isn't ugly as sin. If anyone could point me to examples of
how COSObject.getItem() is supposed to be used in different situations,
that would be great.

Finally, assuming that there really is no way to avoid all this casting,
are there any casts with PDFBox that I can be assured will not result in a
ClassCastException? Specifically, casts to COSArray and COSObject.
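
For illustration, here is a minimal sketch of one way to walk such an array
without catching ClassCastException. It assumes PDFBox 2.x, where
COSObject.getObject() returns the wrapped object and COSArray.getObject(int)
resolves indirect entries in the same way. The class and method names
(LinkUriSketch, collectUris) are made up for this example, and it assumes the
links are stored as annotation dictionaries with a /A action holding a /URI
string, which will not cover every way a URL can be embedded.

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.pdfbox.cos.COSArray;
import org.apache.pdfbox.cos.COSBase;
import org.apache.pdfbox.cos.COSDictionary;
import org.apache.pdfbox.cos.COSName;

public class LinkUriSketch {

    // Hypothetical helper: collects the /URI string from the /A action of
    // every dictionary found in the given array (e.g. a page's /Annots).
    static List<String> collectUris(COSArray annots) {
        List<String> uris = new ArrayList<>();
        for (int i = 0; i < annots.size(); i++) {
            // getObject(i) resolves an indirect COSObject to what it wraps,
            // so no cast to COSObject is needed here.
            COSBase entry = annots.getObject(i);
            if (!(entry instanceof COSDictionary)) {
                continue; // null or some other COS type: nothing to inspect
            }
            // getDictionaryObject also resolves indirect references.
            COSBase action = ((COSDictionary) entry).getDictionaryObject(COSName.A);
            if (action instanceof COSDictionary) {
                // getString returns null when /URI is absent or not a string.
                String uri = ((COSDictionary) action).getString(COSName.URI);
                if (uri != null) {
                    uris.add(uri);
                }
            }
        }
        return uris;
    }
}
```

The instanceof checks take the place of the try/catch blocks, and a null
result fails the check as well, so no separate null test is needed before
each cast.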

Sorry for the haphazard nature of this email, I'm still trying to figure
out exactly all the things about this that I don't understand.
