pdfbox-users mailing list archives

From Maruan Sahyoun <sahy...@fileaffairs.de>
Subject Re: PDF to Image conversion
Date Thu, 06 Jun 2013 16:35:53 GMT
Hi Ankur,

unfortunately there are some issues with rendering these files. It needs some further analysis
to find out why that happens.

Maruan Sahyoun

On 06.06.2013 at 17:05, Ankur Tripathi <ankur@intelligrape.com> wrote:

> Maruan,
> I have added the PDFs to Dropbox; you can download them from here:
> https://www.dropbox.com/sh/ruwz0l5hya2l4tp/9-zLVv96uw
> Let me know if you cannot access this link.
> On Thu, Jun 6, 2013 at 8:22 PM, Maruan Sahyoun <sahyoun@fileaffairs.de> wrote:
>> Hi Ankur,
>> due to limitations of the mailing list the attachment didn't make it -
>> could you upload to a public site?
>> BR
>> Maruan
>> On 06.06.2013 at 15:57, Ankur Tripathi <ankur@intelligrape.com> wrote:
>>> Hi Maruan,
>>> I have successfully converted PDFs to images using PDFBox. However, there
>>> are a couple of attached PDFs whose filled-in details do not show up in
>>> the images.
>>> In my use case I need to merge different PDFs and then convert them to a
>>> series of images. Since these PDFs are not converted to images properly, I
>>> would like to either avoid the merge or fix the image generation. Can you
>>> please point me in the right direction?
>>> Converting these PDFs with ImageMagick works fine, but I would like to
>>> avoid an OS-level tool.
>>> I really appreciate your help.
>>> Thanks
>>> -Ankur Tripathi
>>> On Thu, Jun 6, 2013 at 4:01 PM, Maruan Sahyoun <sahyoun@fileaffairs.de> wrote:
>>>> Hi,
>>>> whether a PDF can be rendered successfully does not depend on it being
>>>> a PDF/A file. In general PDFBox does a good job of converting a PDF to
>>>> an image, and it supports PDF/A as well as non-PDF/A-compliant files.
>>>> There are some limitations, though, which may or may not apply to you.
>>>> - there are limitations in PDF render modes, i.e. not all possible
>>>>   render modes are supported
>>>> - there are limitations in PDF shading, i.e. not all shadings are
>>>>   supported
>>>> - font rendering depends on AWT, i.e. PDFBox generates a font from the
>>>>   embedded font, which is passed to AWT. This works in most but not all
>>>>   cases.
>>>> - always test whether e.g. Adobe Reader can display the file
>>>> ….
>>>> If PDFBox hits a (yet) unsupported feature while rendering, it will be
>>>> reported. If you come across such a limitation, please log an
>>>> enhancement request in Jira (search first to see whether someone else
>>>> has already reported a similar issue, and add to that) so we can look
>>>> into removing the limitation. Of course, if you are able to contribute,
>>>> that's even better.
>>>> For PDF/A (PDF/A-1b), PDFBox passes several test suites. So if you think
>>>> you have a valid PDF/A file and PDFBox complains, we are very interested
>>>> in finding out why this is the case. But it's very likely that your file
>>>> is not PDF/A-1b compliant.
>>>> If you have specific questions/issues, please feel free to ask.
>>>> Bottom line - I think PDFBox will help you with the conversion. Your
>>>> mileage will vary depending on the file's content.
>>>> BR
>>>> Maruan Sahyoun
>>>> On 06.06.2013 at 12:12, Ankur Tripathi <ankur@intelligrape.com> wrote:
>>>>> Hi,
>>>>> I have a use case in my project where I want to convert every page of
>>>>> a PDF to an image. I have tried different open-source libraries, such
>>>>> as PDFRenderer and the open-source version of JPedal, but with each of
>>>>> them we have problems with PDF/A files. Before trying PDFBox, I would
>>>>> like to know whether support for embedded fonts and PDF/A is available
>>>>> in the PDFBox API. If not, is there any way to identify whether a
>>>>> particular PDF cannot be converted to an image? I already tried
>>>>> http://pdfbox.apache.org/cookbook/pdfavalidation.html to validate
>>>>> uploaded PDFs. The validation fails for all of our PDFs, yet we are
>>>>> able to convert them to images properly. Only a few form-filled PDFs
>>>>> have not been converted to images properly.
>>>>> Thanks for your help.
>>>>> Thanks
>>>>> -Ankur Tripathi
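
On the question in the first message (whether files that may not convert
properly can be identified up front), one rough pre-check can be sketched
with the JDK alone. It rests on an assumption the thread does not confirm:
that the form-filled PDFs are the ones at risk, so any file declaring an
interactive form is worth a manual look after rendering. The class
AcroFormCheck and its method mentionsAcroForm are hypothetical names, not part
of PDFBox. The scan looks for the PDF name /AcroForm in the raw bytes, so it
will miss files whose catalog sits in a compressed object stream, and a match
does not mean rendering will fail.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of PDFBox: a heuristic pre-check only.
final class AcroFormCheck {

    // Returns true if the raw bytes contain the PDF name token "/AcroForm".
    // Assumption: form-filled PDFs are the ones that may render incompletely.
    static boolean mentionsAcroForm(byte[] pdfBytes) {
        // ISO-8859-1 maps every byte to exactly one char, so binary data
        // in the file cannot corrupt the search.
        String raw = new String(pdfBytes, StandardCharsets.ISO_8859_1);
        return raw.contains("/AcroForm");
    }

    // Convenience overload that reads the whole file into memory first.
    static boolean mentionsAcroForm(Path pdf) throws IOException {
        return mentionsAcroForm(Files.readAllBytes(pdf));
    }

    public static void main(String[] args) throws IOException {
        // Each argument is treated as a path to a PDF file.
        for (String arg : args) {
            Path pdf = Paths.get(arg);
            String verdict = mentionsAcroForm(pdf)
                    ? "mentions /AcroForm, compare the rendered image with a viewer"
                    : "no /AcroForm token found";
            System.out.println(pdf + ": " + verdict);
        }
    }
}
```

Files flagged this way could be held back from the merge step until their
rendered pages have been compared against a viewer such as Adobe Reader, as
suggested in the reply above.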
