pdfbox-users mailing list archives

From Andreas Lehmkuehler <andr...@lehmi.de>
Subject Re: Uppercase letters are read in lowercase manner
Date Fri, 22 Mar 2013 07:28:55 GMT
Hi,

On 21.03.2013 08:08, Maruan Sahyoun wrote:
> Hi Hesham,
>
> the text in question is defined as marked content in the PDF and not as 'regular text'.
> I think it's wrongly handled/not fully supported (I don't know what the
> implementation status is) in pdfbox (and some other apps I tested with),
> but it is correctly handled in Adobe Reader.
That's correct: the PDF uses marked content to replace a string (section
14.9.4, "Replacement Text", of the PDF spec provides a simple example).
And yes, PDFBox doesn't support that yet.
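
To illustrate the mechanism, here is a rough sketch using only the Java standard library. The content stream fragment is made up for illustration and is not taken from the PDF in question; the class and method names (ActualTextSketch, extract, shownText) are hypothetical and are not PDFBox API. It assumes the glyph shown inside the marked-content span is the lowercase "t" while the /ActualText entry carries the intended "T", which would explain "Testing" coming out as "testing" when the replacement text is ignored.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ActualTextSketch {
    // Matches: /Span << /ActualText (replacement) >> BDC ... EMC
    // Simplified: no nested or escaped parentheses, no hex strings.
    private static final Pattern SPAN = Pattern.compile(
        "/Span\\s*<<\\s*/ActualText\\s*\\(([^)]*)\\)\\s*>>\\s*BDC(.*?)EMC",
        Pattern.DOTALL);

    // Matches a literal string shown with the Tj operator: (text) Tj
    private static final Pattern SHOW = Pattern.compile("\\(([^)]*)\\)\\s*Tj");

    // Collects the text of a fragment. If honorActualText is true, each
    // marked-content span contributes its /ActualText value instead of
    // the strings actually shown inside the span.
    static String extract(String content, boolean honorActualText) {
        StringBuilder out = new StringBuilder();
        Matcher span = SPAN.matcher(content);
        int pos = 0;
        while (span.find()) {
            out.append(shownText(content.substring(pos, span.start())));
            out.append(honorActualText ? span.group(1) : shownText(span.group(2)));
            pos = span.end();
        }
        out.append(shownText(content.substring(pos)));
        return out.toString();
    }

    // Concatenates every string operand of a Tj operator in the fragment.
    static String shownText(String fragment) {
        StringBuilder sb = new StringBuilder();
        Matcher m = SHOW.matcher(fragment);
        while (m.find()) {
            sb.append(m.group(1));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical fragment: the glyph drawn is "t", the replacement is "T".
        String content = "/Span << /ActualText (T) >> BDC\n(t) Tj\nEMC\n(esting) Tj\n";
        System.out.println(extract(content, false)); // replacement text ignored
        System.out.println(extract(content, true));  // replacement text honored
    }
}
```

A real extractor would of course work on parsed operators rather than regular expressions; the sketch is only meant to show why ignoring /ActualText yields the displayed glyphs instead of the intended string.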

> Kind regards
>
> Maruan Sahyoun
>
> On 21.03.2013 at 07:05, Hesham G. <heshamgneady@gmail.com> wrote:
>
>> Andreas,
>>
>> I apologize for this!
>> Please download the PDF from here:
>> https://dl.dropbox.com/u/10111483/downloads/pdfbox/pdf_with_uppercase_letters.pdf
>>
>>
>> Best regards,
>> Hesham
>>
>> ---------------------------------------------
>> Included message:
>>
>>> Hi,
>>>
>>> On 18.03.2013 15:43, Hesham G. wrote:
>>>> Hello,
>>>>
>>>> I have a PDF where, when I read its contents using PDFBox, some uppercase
>>>> letters are read as lowercase. Please check this 1-page sample PDF:
>>>> http://www.4shared.com/office/JXrLadN8/pdf_with_uppercase_letters.html
>>> Do I have to sign up to download the pdf or did I miss the "magic" download button?
>>>
>>>> For example:
>>>> - Word "Testing" is read as "testing"
>>>> - Word "Eve" is read as "eve"
>>>> - Word "Deuteronomy" is read as "deuteronomy"
>>>>
>>>> Is there a reason for this?
>>>>
>>>>
>>>> Best regards,
>>>> Hesham
>>>
>>>
>>> BR
>>> Andreas Lehmkühler

BR
Andreas Lehmkühler
