pdfbox-users mailing list archives

From Tilman Hausherr <THaush...@t-online.de>
Subject Re: Issues with extraction content of PDF files
Date Mon, 21 Dec 2015 08:12:04 GMT
On 21.12.2015 at 04:08, Zheng Lin Edwin Yeo wrote:
> Thanks for your reply.
>
> I tried Adobe Acrobat Pro DC, and it is able to open the file, but when I
> open it in Adobe Reader, it is not able to extract all the text properly.
>
> Is there any way we can check what type of encoding is used for the
> PDF files?

Yes, in the font dictionaries, as you can see from this screenshot:

[screenshot not preserved in the archive]

However, this won't get you the text, obviously.
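
For readers without the screenshot, here is a rough sketch of what "looking in the font dictionaries" can mean. It is not PDFBox code: the class name FontEncodingPeek and its methods are made up for illustration, and it only scans the raw file for uncompressed objects marked /Type /Font. Fonts stored inside compressed object streams (common since PDF 1.5) will be missed, so treat it as a quick peek, not a parser.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of PDFBox: lists /Encoding and /ToUnicode
// for font dictionaries that are stored uncompressed in the file.
class FontEncodingPeek {
    // Indirect objects of the form "12 0 obj ... endobj" (may span lines).
    private static final Pattern OBJ =
            Pattern.compile("(\\d+)\\s+(\\d+)\\s+obj(.*?)endobj", Pattern.DOTALL);
    // Matches "/Type /Font" but not "/Type /FontDescriptor".
    private static final Pattern FONT_TYPE =
            Pattern.compile("/Type\\s*/Font(?![A-Za-z])");
    // /Encoding value: a name, an indirect reference, or an inline dictionary.
    private static final Pattern ENCODING =
            Pattern.compile("/Encoding\\s*(/[^\\s/<>\\[\\]()]+|\\d+\\s+\\d+\\s+R|<<)");

    // Returns the /Encoding entry of one font dictionary as text.
    static String encodingOf(String fontDict) {
        Matcher m = ENCODING.matcher(fontDict);
        if (!m.find()) {
            return "(none: font's built-in encoding)";
        }
        String value = m.group(1);
        return value.equals("<<") ? "(inline encoding dictionary)" : value;
    }

    // A /ToUnicode CMap is what usually makes text extraction reliable.
    static boolean hasToUnicode(String fontDict) {
        return fontDict.contains("/ToUnicode");
    }

    // One summary line per font dictionary found in the raw PDF text.
    static List<String> describeFonts(String pdfText) {
        List<String> out = new ArrayList<>();
        Matcher m = OBJ.matcher(pdfText);
        while (m.find()) {
            String body = m.group(3);
            if (FONT_TYPE.matcher(body).find()) {
                out.add("object " + m.group(1)
                        + ": Encoding=" + encodingOf(body)
                        + ", ToUnicode=" + hasToUnicode(body));
            }
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        // ISO-8859-1 maps every byte to one char, so binary data survives.
        byte[] bytes = Files.readAllBytes(Paths.get(args[0]));
        String text = new String(bytes, StandardCharsets.ISO_8859_1);
        describeFonts(text).forEach(System.out::println);
    }
}
```

With Java 11 or later this can be run directly as `java FontEncodingPeek.java file.pdf`. A font that reports ToUnicode=false together with a custom or Identity-H encoding is a common sign that extraction will produce garbage, which is why knowing the encoding alone does not recover the text.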

Tilman

>
> Regards,
> Edwin
>
>
>
>
> On 19 December 2015 at 03:07, Tilman Hausherr <THausherr@t-online.de> wrote:
>
>> On 18.12.2015 at 18:57, Zheng Lin Edwin Yeo wrote:
>>
>>> I've shared one of the files with the issue on Dropbox, which you can
>>> access via the link here:
>>> https://www.dropbox.com/s/rufi9esmnsmzhmw/Desmophen%2B670%2BBAe.pdf?dl=0
>>>
>> Adobe Reader is also unable to extract text.

