pdfbox-users mailing list archives

From Matthew Sheppard <matthew.shepp...@gmail.com>
Subject Accessing "alternate text" for an image via PDFBox?
Date Fri, 21 Sep 2012 07:57:41 GMT
Is there some way to extract "alternate text" for a specific image using PDFBox?

I have a PDF file in which alternate text has been added to an image,
as described at
http://www.w3.org/WAI/GL/2011/WD-WCAG20-TECHS-20110621/pdf.html#PDF1.
Using PDFBox I can find my way through the object model to the image
itself (a PDXObjectImage) via
PDDocument.getDocumentCatalog().getAllPages() [iterator]
.getResources().getImages(), but I cannot see any way to get from the
image itself to the alternate text for it.

A small sample PDF (with a single image which has some alternate text
specified) can be found at
http://dl.dropbox.com/u/12253279/image_test_pass.pdf

Many thanks in advance to anyone who is able to point me in the right direction,
Matt Sheppard
