pdfbox-users mailing list archives

From Serban Alexe <serban.al...@gmail.com>
Subject Convert PDF to HTML with PDFBox in a Java app - Need some introductory info & guidance
Date Thu, 01 Feb 2018 16:14:00 GMT
Hello everybody,

I need to write a Java class that converts a *.pdf* document to HTML,
preserving the original formatting as closely as possible.
I also need to extract the images (and preferably embed them
as base64 in the HTML file).
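
To make the image requirement concrete, here is a minimal sketch of the output I have in mind for embedded images. It uses only the plain JDK (`java.util.Base64`) and makes no PDFBox calls; the class name `ImageDataUri` and the helpers `toDataUri` and `toImgTag` are hypothetical names made up for illustration, and the image bytes are assumed to have already been extracted from the PDF by some other means.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class ImageDataUri {

    // Hypothetical helper: wraps raw image bytes in a "data:" URI.
    // The caller is assumed to know the MIME type (e.g. "image/png").
    public static String toDataUri(byte[] imageBytes, String mimeType) {
        String encoded = Base64.getEncoder().encodeToString(imageBytes);
        return "data:" + mimeType + ";base64," + encoded;
    }

    // Hypothetical helper: builds an <img> element that carries the image inline,
    // so the HTML file needs no external image files.
    public static String toImgTag(byte[] imageBytes, String mimeType) {
        return "<img src=\"" + toDataUri(imageBytes, mimeType) + "\"/>";
    }

    public static void main(String[] args) {
        // Placeholder bytes only; a real run would pass bytes of an extracted image.
        byte[] placeholder = "not a real image".getBytes(StandardCharsets.US_ASCII);
        System.out.println(toImgTag(placeholder, "image/png"));
    }
}
```

The part I am missing is how to get the image bytes (and the formatted text) out of the PDF in the first place.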

*Can you please provide me with some useful starting points and/or examples?*

Through a Google search, I was only able to find examples with limited
functionality. None of them deal with images, and my guess is that they
target some older version of the PDFBox suite...

Thank you,

