pdfbox-users mailing list archives

From Andreas Lehmkühler <andr...@lehmi.de>
Subject Re: UTF16 encoded string to PDFDocEncoding
Date Tue, 11 Jul 2017 10:17:35 GMT

> On 10 July 2017 at 19:22, Andrea Vacondio <andrea.vacondio@gmail.com> wrote:
> 
> 
> Hi, we came across a case where we are basically cloning outline items
> whose original outline title is a UTF-16BE encoded text string
> containing the value 00A0 (non-breaking space). We later use the string to
> assign the title of a new outline item, and the A0 is recognised as a € sign.
> Here is a simple test:
> 
>         COSString victim = COSString
>                 .parseHex("FEFF004300680061007000740065007200A0");
>         PDOutlineItem node = new PDOutlineItem();
>         node.setTitle(victim.getString());
> 
> If you look at the node dictionary you'll see that the title value is
> Chapter€
How do you look at the dictionary?

The following code:
COSString victim = COSString.parseHex("FEFF004300680061007000740065007200A0");
System.out.println(victim.toHexString());
System.out.println(victim.getString());
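
For comparison, here is a minimal JDK-only sketch (no PDFBox involved) that decodes the same hex bytes and prints each code point, so the expected title can be checked independently of any viewer or font. The class name Utf16TitleCheck and both helper methods are hypothetical names made up for this illustration; the non-BOM fallback is a placeholder and not a real PDFDocEncoding decoder. The last code point should be U+00A0. In PDFDocEncoding the single byte A0 is assigned to the Euro sign, so a € would appear if that code point ended up being written out as one PDFDocEncoding byte.

```java
import java.nio.charset.StandardCharsets;

class Utf16TitleCheck {

    // Convert a hex string such as "FEFF0043" into raw bytes.
    static byte[] hexToBytes(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    // Decode as UTF-16BE when the FE FF byte order mark is present.
    static String decodeTextString(byte[] bytes) {
        boolean hasBom = bytes.length >= 2
                && (bytes[0] & 0xFF) == 0xFE
                && (bytes[1] & 0xFF) == 0xFF;
        if (hasBom) {
            // Skip the two BOM bytes and decode the rest.
            return new String(bytes, 2, bytes.length - 2, StandardCharsets.UTF_16BE);
        }
        // Assumption for illustration only: NOT a real PDFDocEncoding decoder.
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String title = decodeTextString(
                hexToBytes("FEFF004300680061007000740065007200A0"));
        // Print every UTF-16 code unit so an invisible NBSP is easy to spot.
        for (int i = 0; i < title.length(); i++) {
            System.out.printf("U+%04X%n", (int) title.charAt(i));
        }
    }
}
```

If the decoded string ends in U+00A0 but the dictionary still shows €, the conversion is probably going wrong when the title is written back, not when it is read.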
