Subject: Re: UTF16 encoded string to PDFDocEncoding
To: users@pdfbox.apache.org
From: Tilman Hausherr
Date: Tue, 11 Jul 2017 16:58:16 +0200

Fixed in https://issues.apache.org/jira/browse/PDFBOX-3864

Tilman

On 11.07.2017 at 16:06, Tilman Hausherr wrote:
> The cause is "gaps" in the PDFDocEncoding specification that were
> missed in the implementation. I'll create an issue later.
>
> Tilman
>
> On 10.07.2017 at 19:22, Andrea Vacondio wrote:
>> Hi, we came across a case where we are basically cloning outline
>> items, and the original outline title is a UTF-16BE encoded text string
>> containing the value 00A0 (non-breaking space). We later use the string
>> to assign the title of a new outline item, and the A0 is recognised as
>> a € sign.
>> Here is a simple test:
>>
>> COSString victim = COSString
>>     .parseHex("FEFF004300680061007000740065007200A0");
>> PDOutlineItem node = new PDOutlineItem();
>> node.setTitle(victim.getString());
>>
>> If you look at the node dictionary you'll see that the title value is
>> Chapter€
>>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
> For additional commands, e-mail: users-help@pdfbox.apache.org
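
For readers following the thread, here is a minimal sketch of the round trip described in the report, in plain Java with no PDFBox dependency. The class and method names (TextStringSketch, parseHex, decodeUtf16Be, encode) are hypothetical. The rule that only printable ASCII is written as single bytes is a deliberately conservative assumption for illustration; it is not PDFBox's encoding logic and not the change made in PDFBOX-3864. It shows that U+00A0 survives if the title is written back as UTF-16BE rather than as the single byte A0.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class TextStringSketch {

    // Convert a hex string such as "FEFF0043" into raw bytes.
    static byte[] parseHex(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    // Decode a text string that starts with the UTF-16BE byte order mark FE FF.
    static String decodeUtf16Be(byte[] raw) {
        if (raw.length < 2 || (raw[0] & 0xFF) != 0xFE || (raw[1] & 0xFF) != 0xFF) {
            throw new IllegalArgumentException("missing FE FF byte order mark");
        }
        return new String(raw, 2, raw.length - 2, StandardCharsets.UTF_16BE);
    }

    // Assumption for this sketch: only printable ASCII (0x20..0x7E) is treated
    // as safe to write one byte per character. Everything else goes to UTF-16BE.
    static boolean isSafeSingleByte(String text) {
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < 0x20 || c > 0x7E) {
                return false;
            }
        }
        return true;
    }

    // Encode a title: single bytes when safe, otherwise BOM + UTF-16BE.
    static byte[] encode(String text) {
        if (isSafeSingleByte(text)) {
            return text.getBytes(StandardCharsets.US_ASCII);
        }
        byte[] body = text.getBytes(StandardCharsets.UTF_16BE);
        byte[] out = new byte[body.length + 2];
        out[0] = (byte) 0xFE;
        out[1] = (byte) 0xFF;
        System.arraycopy(body, 0, out, 2, body.length);
        return out;
    }

    public static void main(String[] args) {
        byte[] original = parseHex("FEFF004300680061007000740065007200A0");
        String title = decodeUtf16Be(original);
        byte[] written = encode(title);
        // The last character is the non-breaking space U+00A0.
        System.out.println(title.charAt(title.length() - 1) == '\u00A0');
        // Re-encoding reproduces the original bytes, so nothing turns into A0.
        System.out.println(Arrays.equals(original, written));
    }
}
```

With the hex string from the report, the decoded title is "Chapter" followed by U+00A0, and the re-encoded bytes match the original 18 bytes. A real implementation would consult the full PDFDocEncoding table instead of the ASCII-only shortcut used here.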