pdfbox-users mailing list archives

From John Hewson <j...@jahewson.com>
Subject Re: Strange character
Date Mon, 02 Nov 2015 07:48:15 GMT
Iterate over each character and check if Character.UnicodeBlock.of(char) is
equal to Character.UnicodeBlock.PRIVATE_USE_AREA. If so, omit the
character.

— John
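
The approach described above can be sketched in Java roughly as follows. This is only an illustration, not code from the thread: the class name PrivateUseFilter and the method stripPrivateUse are hypothetical, and the input is assumed to be an already extracted String (for example, the text returned by PDFTextStripper). It only covers the Basic Multilingual Plane private use block (U+E000 to U+F8FF), which includes U+F0B7.

```java
public final class PrivateUseFilter {

    private PrivateUseFilter() {
        // utility class, not meant to be instantiated
    }

    /**
     * Returns a copy of the given text with every character from the
     * PRIVATE_USE_AREA block (U+E000..U+F8FF) omitted.
     * Hypothetical helper, named here for illustration only.
     */
    public static String stripPrivateUse(String text) {
        StringBuilder kept = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            // UnicodeBlock.of may return null for unassigned code units;
            // the reference comparison below handles that safely.
            if (Character.UnicodeBlock.of(c) != Character.UnicodeBlock.PRIVATE_USE_AREA) {
                kept.append(c);
            }
        }
        return kept.toString();
    }

    public static void main(String[] args) {
        // U+F0B7 is the private use character discussed in this thread.
        String extracted = "\uF0B7 First item";
        System.out.println(stripPrivateUse(extracted));
    }
}
```

Note that this drops the character rather than mapping it to a readable equivalent such as a bullet, and it leaves the supplementary private use planes untouched; whether that is acceptable depends on the documents being processed.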

> On 1 Nov 2015, at 21:10, srinath prathi <srinath.prathi@gmail.com> wrote:
> 
> Thank you for the information. How can I remove it? Replacing it with ""
> does not work. I want it removed. Could you please help me with this?
> 
> 
> 
> 
> Yours Sincerely
> Srinath
> 
> On Mon, Nov 2, 2015 at 12:23 AM, John Hewson <john@jahewson.com> wrote:
> 
>> Indeed it is. The character which you’ve pasted in the e-mail below is U+F0B7,
>> which is a private use code point:
>> 
>> https://codepoints.net/U+F0B7?lang=en
>> 
>> This means that the PDF contains some private text encoding which, while you
>> can recognise the characters on the screen, doesn’t correspond to any usable
>> text as far as the encoded content goes. This is not uncommon for PDF.
>> 
>> — John
>> 
>> 
>>> On 1 Nov 2015, at 02:39, Olaf Drümmer <olaflist@callassoftware.com> wrote:
>>> 
>>> Hi Srinath,
>>> 
>>> this is the so-called ".notdef" replacement glyph you typically get when
>>> rendering text with a font that does not contain the glyph needed to render
>>> a given character.
>>> 
>>> Olaf
>>> 
>>> 
>>>> On 01.11.2015, at 10:23, srinath prathi <srinath.prathi@gmail.com> wrote:
>>>> 
>>>> Dear All
>>>> What is this character  ? I get it while stripping text from a PDF. How
>>>> should I treat it?
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> Yours Sincerely
>>>> Srinath
>>> 
>>> 
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
>>> For additional commands, e-mail: users-help@pdfbox.apache.org
>>> 
>> 
>> 

