pdfbox-users mailing list archives

From Darcy Dechene <darcy.dech...@gmail.com>
Subject Re: PDFToImage results in missing text
Date Thu, 08 Aug 2013 16:25:38 GMT
Yes, that is the issue. I'll monitor PDFBOX-1608, and for now we will roll
back to a Java version earlier than 1.7.0_21.

Thanks for the quick response.

Darcy
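
For anyone who wants to guard against this until PDFBOX-1608 is resolved, here is a minimal sketch of a check that flags the Java versions reported in this thread. It is plain Java and not part of PDFBox; the class name JavaUpdateCheck, the method isAffected and the constant FIRST_AFFECTED_UPDATE are hypothetical. The cut-off at update 21 is an assumption based only on the reports here (updates 21 and 25 lose the text, update 11 does not); updates between 11 and 21 were not tested.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of PDFBox.
final class JavaUpdateCheck {

    // Matches legacy version strings such as "1.7.0_21" or "1.7.0_25-b17".
    private static final Pattern LEGACY_VERSION =
            Pattern.compile("^1\\.(\\d+)\\.\\d+(?:_(\\d+))?.*$");

    // Assumption from this thread only: update 21 is the first update
    // reported to lose the text; update 11 is reported to work.
    static final int FIRST_AFFECTED_UPDATE = 21;

    // Returns true if the given java.version string is Java 7 at or above
    // the first update reported as affected.
    static boolean isAffected(String javaVersion) {
        Matcher m = LEGACY_VERSION.matcher(javaVersion);
        if (!m.matches()) {
            return false; // not a "1.x.y_zz" style version string
        }
        int major = Integer.parseInt(m.group(1));
        int update = (m.group(2) == null) ? 0 : Integer.parseInt(m.group(2));
        return major == 7 && update >= FIRST_AFFECTED_UPDATE;
    }
}
```

It could be called as JavaUpdateCheck.isAffected(System.getProperty("java.version")) before starting a conversion, for example to log a warning that the rendered image may be missing text.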


On Thu, Aug 8, 2013 at 10:05 AM, Andreas Lehmkuehler <andreas@lehmi.de> wrote:

> Hi,
>
> On 08.08.2013 16:58, Darcy Dechene wrote:
>
>> Hi,
>>
>> When converting the attached pdf file using the command line tool
>> PDFToImage, the resulting image is missing all the text (attached
>> test_J25.jpg).
>>
> Your attachment didn't make it through due to restrictions on the mailing
> list. But I guess we won't need an image demonstrating that something is
> missing. ;-)
>
>
>> This is being run on a Windows system and occurs when using Java 7
>> (update 21 or 25). When running on the same Windows system using Java 7
>> update 11 the resulting image is just fine and contains all the text
>> (attached test_J11.jpg).
>>
> Sounds like your pdf uses type1C/CFF fonts. That's a known issue and
> PDFBOX-1608 [1] deals with it.
>
>
>> Does anyone know what might cause this or has ideas for further debugging?
>>
> You may double-check whether I'm right. Open the pdf in question using
> Acrobat and have a look at the document's properties (File -> Properties).
> The font folder lists all used fonts. You may see one or more entries for
> an embedded subset of a type1 font.
>
>> Thanks!
>> Darcy
>>
>
> BR
> Andreas Lehmkühler
> [1] https://issues.apache.org/jira/browse/PDFBOX-1608
>
