pdfbox-users mailing list archives

From Tilman Hausherr <THaush...@t-online.de>
Subject Re: Empty glyphs
Date Wed, 22 Jun 2016 20:53:35 GMT
On 22.06.2016 at 22:50, Brzrk One wrote:
> isn't that the ByteOrderMark?

No. The UTF-8 byte order mark is EF BB BF, not EF BF BF:
https://stackoverflow.com/questions/10310210/is-ef-bf-bf-an-allowed-character-in-xml-utf-8
https://en.wikipedia.org/wiki/Byte_order_mark#UTF-8

Tilman
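
To make the difference concrete, here is a minimal Java sketch (not part of the original reply) that decodes both byte sequences and prints the resulting code points. EF BB BF decodes to U+FEFF, the byte order mark, while EF BF BF decodes to U+FFFF, which Unicode classifies as a noncharacter. The class name NoncharacterCheck and the helpers isNoncharacter and stripNoncharacters are hypothetical names used only for illustration; they are not PDFBox API, and filtering the extracted text this way is one possible workaround, not a fix confirmed in this thread.

```java
import java.nio.charset.StandardCharsets;

class NoncharacterCheck {

    // True for Unicode noncharacters: U+FDD0..U+FDEF, plus every code point
    // whose low 16 bits are FFFE or FFFF (U+FFFE, U+FFFF, U+1FFFE, ...).
    static boolean isNoncharacter(int codePoint) {
        return (codePoint >= 0xFDD0 && codePoint <= 0xFDEF)
                || (codePoint & 0xFFFE) == 0xFFFE;
    }

    // Hypothetical helper: returns a copy of the text with noncharacters removed.
    static String stripNoncharacters(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        text.codePoints()
            .filter(cp -> !isNoncharacter(cp))
            .forEach(sb::appendCodePoint);
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] suspicious = {(byte) 0xEF, (byte) 0xBF, (byte) 0xBF};
        byte[] bom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

        String s = new String(suspicious, StandardCharsets.UTF_8);
        String b = new String(bom, StandardCharsets.UTF_8);

        System.out.printf("EF BF BF -> U+%04X%n", s.codePointAt(0)); // U+FFFF
        System.out.printf("EF BB BF -> U+%04X%n", b.codePointAt(0)); // U+FEFF

        // Remove the noncharacter from a sample string.
        System.out.println(stripNoncharacters("a" + s + "b")); // ab
    }
}
```

A filter like stripNoncharacters could be applied to the string returned by the text extraction step before writing it to a file, assuming the U+FFFF characters carry no meaning in the document.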


>
> On Wed, Jun 22, 2016 at 2:03 PM, Tilman Hausherr <THausherr@t-online.de>
> wrote:
>
>> From what I see, the "whitespace" characters are the bytes EF BF BF, which
>> is not a valid UTF-8 character. Please upload the PDF file somewhere.
>>
>> Tilman
>>
>>
>> On 22.06.2016 at 18:39, OYEBISI, Daniel wrote:
>>
>>> The problem is with some of the whitespace, which appears empty in Notepad
>>> but really is not.
>>> Please try opening the text file with other text editors.
>>> Thanks
>>>
>>> -----Original Message-----
>>> From: Tilman Hausherr [mailto:THausherr@t-online.de]
>>> Sent: Wednesday, 22 June 2016 17:54
>>> To: users@pdfbox.apache.org
>>> Subject: Re: Empty glyphs
>>>
>>> Your PDF didn't get through (security), but this sounds like a Notepad++
>>> problem.
>>>
>>> I could display your txt file with the normal Notepad by changing the
>>> font to Wingdings.
>>>
>>> Tilman
>>>
>>> On 22.06.2016 at 16:58, OYEBISI, Daniel wrote:
>>>
>>>> Hello,
>>>>
>>>> I came across an issue while trying to extract the text using
>>>> PDFTextStripper from the PDF file attached to this email.
>>>>
>>>> When I open the generated txt document in Notepad, it appears
>>>> normal, but when I open it with Notepad++, it gives an interesting
>>>> result.
>>>>
>>>> Please can you have a look at this?
>>>>
>>>> Thanks
>>>>
>>>>
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
>>>> For additional commands, e-mail: users-help@pdfbox.apache.org
>>>>



