pdfbox-users mailing list archives

From Maruan Sahyoun <sahy...@fileaffairs.de>
Subject Re: PDF to Image conversion
Date Thu, 06 Jun 2013 14:52:40 GMT
Hi Ankur,

Due to limitations of the mailing list, the attachment didn't make it through. Could you upload it to
a public site?
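
On the question quoted below of how to identify a PDF that cannot be converted, here is a minimal sketch of one way to isolate problem pages: render page by page and record the pages where rendering throws. `PageRenderer`, `Result` and `renderAll` are hypothetical names made up for this illustration, not PDFBox types; the real rendering call for your PDFBox version would go inside the `PageRenderer` implementation. Note that this only catches pages where rendering fails outright. A page that renders but silently omits filled-in form fields would not be detected this way.

```java
import java.awt.image.BufferedImage;
import java.util.ArrayList;
import java.util.List;

final class PageRenderCheck {

    /** Hypothetical stand-in for the real per-page rendering call (not a PDFBox type). */
    public interface PageRenderer {
        BufferedImage render(int pageIndex) throws Exception;
    }

    /** Outcome of one pass over a document. */
    public static final class Result {
        /** Images of the pages that rendered, in page order. */
        public final List<BufferedImage> images = new ArrayList<>();
        /** Zero-based indices of pages that threw or returned no image. */
        public final List<Integer> failedPages = new ArrayList<>();
    }

    /** Renders pages 0..pageCount-1 and keeps going when a single page fails. */
    public static Result renderAll(PageRenderer renderer, int pageCount) {
        Result result = new Result();
        for (int i = 0; i < pageCount; i++) {
            try {
                BufferedImage image = renderer.render(i);
                if (image == null) {
                    result.failedPages.add(i);
                } else {
                    result.images.add(image);
                }
            } catch (Exception e) {
                // Assumption: a failed page surfaces as an exception from the renderer.
                result.failedPages.add(i);
            }
        }
        return result;
    }
}
```

If `failedPages` is non-empty for a document, it could be excluded from the merge or routed to a fallback converter, while the remaining documents go through the normal path.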


On 06.06.2013 at 15:57, Ankur Tripathi <ankur@intelligrape.com> wrote:

> Hi Maruan,
> I have successfully converted PDFs to images using PDFBox. However, a couple of the
> attached PDFs do not show the filled-in details.
> In my use case I need to merge different PDFs and then convert them to a series of images.
> Since these PDFs are not converted to images properly, I would like to either avoid the merge or fix the image
> generation. Can you please point me in the right direction?
> Converting this PDF with ImageMagick works fine, but I would like to avoid an OS-level tool.
> I really appreciate your help.
> Thanks
> -Ankur Tripathi
> On Thu, Jun 6, 2013 at 4:01 PM, Maruan Sahyoun <sahyoun@fileaffairs.de> wrote:
>> Hi,
>> whether a PDF can be rendered successfully does not depend on whether
>> it's a PDF/A file. In general, PDFBox does a good job of converting a PDF to an image, and it supports
>> PDF/A as well as non-PDF/A-compliant files. There are some limitations, though, which may or
>> may not apply to you.
>> - there are limitations in PDF render modes, i.e. not all possible render modes are supported
>> - there are limitations in PDF shading, i.e. not all shadings are supported
>> - font rendering depends on AWT, i.e. PDFBox generates a font from the embedded font,
>> which is passed to AWT. This works in most but not all cases.
>> - always test whether e.g. Adobe Reader can display the file
>> ….
>> If PDFBox hits a (yet) unsupported feature while rendering, it will be reported.
>> If you come across such a limitation, please log an enhancement request in Jira (you should
>> search first whether someone else has already had a similar issue and add to that) so we can look into
>> removing the limitation. Of course, if you are able to contribute, that's even better.
>> For PDF/A (PDF/A-1b), PDFBox passes several test suites. So if you think you have
>> a valid PDF/A file and PDFBox complains, we are very interested in finding out why this is
>> the case. But it's very likely that your file is not PDF/A-1b compliant.
>> If you have specific questions/issues please feel free to ask.
>> Bottom line - I think PDFBox will help you do the conversion. Your mileage will
>> vary depending on the file's content.
>> BR
>> Maruan Sahyoun
>> On 06.06.2013 at 12:12, Ankur Tripathi <ankur@intelligrape.com> wrote:
>> > Hi,
>> >
>> > I have a use case in my project where I want to convert every page of a PDF
>> > to an image. I have tried different open source libraries like pdfrenderer,
>> > the open source version of JPedal, etc., but with each of them we have problems
>> > with PDF/A files. Before trying PDFBox I would like to know whether support
>> > for embedded fonts and PDF/A is available in the PDFBox API. If not, is there any
>> > way to identify whether a particular PDF cannot be converted to an image? I
>> > already tried http://pdfbox.apache.org/cookbook/pdfavalidation.html to
>> > validate uploaded PDFs, but it fails for all of our PDFs even though we are able to
>> > convert them into images properly. There are only a few form-filled PDFs
>> > which have not been converted to images properly.
>> >
>> > Thanks for help.
>> >
>> >
>> > Thanks
>> > -Ankur Tripathi
