pdfbox-users mailing list archives

From Andreas Lehmkuehler <andr...@lehmi.de>
Subject Re: Issue with PDF - Image conversion
Date Tue, 11 Jun 2013 12:07:03 GMT
Hi,

On 10.06.2013 11:15, Robin Thomas Panicker wrote:
> Thanks a lot Gilad, for responding. I was not sure on what more information
> to provide. Now that you have asked me the specific details, let me provide
> you with more information.
>
> I am using the below code to do the conversion of PDF - image. (Trying to
> save the first page of the pdf as an image file)
>
>   String pdfFile = "d:/hs/4.pdf";
>   document = PDDocument.load( pdfFile );
>
>   List pages = document.getDocumentCatalog().getAllPages();
>   PDPage page = ( PDPage ) pages.get( 0 );
>   int width = ( int ) page.getArtBox().getWidth();
>   int height = ( int ) page.getArtBox().getHeight();
>   BufferedImage image = page.convertToImage( imageType, resolution );
>
>
> On the machine (prod server) where the conversion DOES NOT work, I have
> Ubuntu 12.04 and OpenOffice 3.0,
> while on the machine (development machine) where the conversion works, I have
> Ubuntu 10.10 and OpenOffice 3.0.
>
> Both machines run the same code and the same version of PDFBox,
> 1.8.1.
>
> The issue that I face is that the image conversion simply doesn't work
> correctly (I can see parts of images / text garbled or missing). There is
> no error or warning in the log output.
>
> Please let me know if I can provide you with any more information to help
> in understanding the problem.
Without a sample pdf this is just a guess:

Since you are using OpenOffice 3.0, I assume that the PDF in question contains
fonts embedded as subsets. PDFBox does not fully support those, and there are
several known issues with this kind of font.
As you are using different platforms (Ubuntu 10.10 vs. 12.04), you are most
likely also using different versions of the JDK (1.6 vs. 1.7). There are some
1.7-specific issues with embedded font subsets.
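
To check both guesses on each machine, something like the following could be
run on the dev box and on the prod server and the output compared. This is
only a rough sketch, not PDFBox code: the class and method names
(RenderEnvCheck, isSubsetFontName, describeRuntime) are made up for
illustration, and the font names in main() are placeholders. The real base
font names would have to be read from the page's font resources. The only
fact it relies on is that the PDF format names a subset font with a tag of
six uppercase letters followed by a plus sign, e.g. "ABCDEF+DejaVuSans".

```java
import java.util.regex.Pattern;

public class RenderEnvCheck {

    // A subset font's base name starts with six uppercase letters and a '+'.
    private static final Pattern SUBSET_TAG = Pattern.compile("^[A-Z]{6}\\+.+");

    // Returns true if the given base font name looks like an embedded subset.
    public static boolean isSubsetFontName(String baseFontName) {
        return baseFontName != null && SUBSET_TAG.matcher(baseFontName).matches();
    }

    // Summarises the JVM and OS so two machines can be compared side by side.
    public static String describeRuntime() {
        return System.getProperty("java.version")
                + " / " + System.getProperty("java.vendor")
                + " / " + System.getProperty("os.name")
                + " " + System.getProperty("os.version");
    }

    public static void main(String[] args) {
        System.out.println("Runtime: " + describeRuntime());

        // Placeholder names; replace with the base font names from the PDF.
        String[] fontNames = { "ABCDEF+DejaVuSans", "Helvetica" };
        for (String name : fontNames) {
            System.out.println(name + " -> subset: " + isSubsetFontName(name));
        }
    }
}
```

If the two machines report different Java versions and the PDF's fonts carry
such a tag, that would at least be consistent with the guess above.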

> Thanks,
> Robin
>
>
>
> On Mon, Jun 10, 2013 at 2:25 PM, Gilad Denneboom
> <gilad.denneboom@gmail.com> wrote:
>
>> A lot of information is missing there... How are you converting the PDF
>> files, exactly? What type of problems do you encounter? Which version of
>> PDFBox do you use? And what does it have to do with your Office suite?
>>
>> Without more information it's impossible to help you with your problem.
>>
>>
>> On Mon, Jun 10, 2013 at 8:22 AM, Robin Thomas Panicker <robin@qburst.com>
>> wrote:
>>
>>> Hi,
>>> I am using PDFBox to convert PDF documents into images. However, on
>>> some machines I am facing an issue. The conversion does not happen
>>> correctly. I can see missing text / images etc.
>>>
>>> Please note that this happens only on a few machines. I use Ubuntu and
>>> OpenOffice. I have tried a variety of combinations of different
>>> versions of Ubuntu and OpenOffice (and even LibreOffice).
>>>
>>> However I am unable to find out why it does not work on some machines.
>>>
>>> Can anyone please help?
>>>
>>> Thanks,
>>> Robin

BR
Andreas Lehmkühler

