From: John Hewson
To: users@pdfbox.apache.org
Subject: Re: Problem reading PDF with Type1 font
Date: Tue, 12 Jul 2016 19:44:22 -0700

> On 12 Jul 2016, at 07:25, Thomas Letsch wrote:
>
> On 12.07.2016 at 16:15, Andreas Lehmkuehler wrote:
>> On 12.07.2016 at 15:18, Thomas Letsch wrote:
>>> Hi Andreas,
>>>
>>> thanks for your answer.
>>>
>>> On 12.07.2016 at 15:01, Andreas Lehmkuehler wrote:
>>>> Hi,
>>>>
>>>> On 12.07.2016 at 14:18, Thomas Letsch wrote:
>>>>> Hi,
>>>>>
>>>>> I am reading a PDF file with an embedded Type1 font. I am getting an
>>>>> IOException during parsing of the PDF (I removed the name of the font
>>>>> for legal reasons):
>>>> Why not, is it a secret font?
>>> Actually I don't know. The whole PDF is confidential, so any part of it
>>> probably is, too. And it's not a common font you find on the web.
>> But maybe some of the devs have access to that font. Saying that, what
>> is the name of that specific font?
> You are right, probably me being too strict. It's called ARTWAB+Helvetica.

That's a subset, so we're going to need the actual font file. You can extract it from the PDF using our GUI-based PDFDebugger.
Navigate to the page in question and find the Font resource with that name. Right-click on the FontFile resource in the tree and save the stream to a .pfb file. Then send us that file.

-- John

>>
>>>>
>>>>> PDType1Font [ERROR] Can't read the embedded Type1 font
>>>>> java.io.IOException: Found Token[kind=START_ARRAY, text=[] but
>>>>> expected INTEGER
>>>>>
>>>>> Unfortunately I cannot send you the PDF, but I can send you an extract
>>>>> with the (hopefully) interesting parts.
>>>> It doesn't help. We need the font itself as the parser throws the
>>>> exception when reading the font.
>>> This is my fault, I didn't include the whole stack trace. Sorry, here
>>> it is:
>>> java.io.IOException: Found Token[kind=START_ARRAY, text=[] but expected INTEGER
>>>     at org.apache.fontbox.type1.Type1Parser.read(Type1Parser.java:754)
>>>     at org.apache.fontbox.type1.Type1Parser.readEncoding(Type1Parser.java:200)
>>>     at org.apache.fontbox.type1.Type1Parser.parseASCII(Type1Parser.java:128)
>>>     at org.apache.fontbox.type1.Type1Parser.parse(Type1Parser.java:61)
>>>     at org.apache.fontbox.type1.Type1Font.createWithSegments(Type1Font.java:85)
>>>     at org.apache.pdfbox.pdmodel.font.PDType1Font.<init>(PDType1Font.java:228)
>>>     at org.apache.pdfbox.pdmodel.font.PDFontFactory.createFont(PDFontFactory.java:62)
>>>     at org.apache.pdfbox.pdmodel.PDResources.getFont(PDResources.java:123)
>>>     at org.apache.pdfbox.contentstream.operator.text.SetFontAndSize.process(SetFontAndSize.java:60)
>>>     at org.apache.pdfbox.contentstream.PDFStreamEngine.processOperator(PDFStreamEngine.java:815)
>>>     at org.apache.pdfbox.contentstream.PDFStreamEngine.processStreamOperators(PDFStreamEngine.java:472)
>>>     at org.apache.pdfbox.contentstream.PDFStreamEngine.processStream(PDFStreamEngine.java:446)
>>>     at org.apache.pdfbox.contentstream.PDFStreamEngine.processPage(PDFStreamEngine.java:149)
>>>
>>> This looks to my non-expert eyes like a problem in reading the
>>> encoding. At least this is my hope, because then we perhaps can get
>>> along without the font file.
>> The parser has a problem with reading the _internal_ encoding of the
>> font, which has nothing to do with the encoding within the PDF itself.
>> There are 2 possible issues: either the font is malformed or our
>> parser has a bug, and unfortunately we need the font itself to find out.
> Ok, I understand.
>
> Best Regards,
> Thomas
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
> For additional commands, e-mail: users-help@pdfbox.apache.org
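
The trace above shows readEncoding hitting a "[" where it expected an integer. One possible reading, and it is only an assumption until the font file is available, is that the font's cleartext header declares /Encoding as a literal array instead of the usual "/Encoding 256 array" or "/Encoding StandardEncoding def" forms. A minimal sketch for checking this on the saved .pfb file is below. EncodingProbe and its category names are hypothetical and not part of PDFBox or FontBox, and the regular expression is a rough probe, not how the FontBox parser tokenizes the font.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical diagnostic helper; not part of PDFBox or FontBox. */
final class EncodingProbe {

    // "/Encoding" followed by whitespace or directly by "[", then the first token.
    private static final Pattern ENCODING =
            Pattern.compile("/Encoding(?:\\s+|(?=\\[))(\\[|[^\\s\\[]+)");

    /** Classifies the token that follows /Encoding in the cleartext part of a Type 1 font. */
    static String classify(String fontText) {
        // Only the cleartext header is readable; the data after "eexec" is encrypted.
        int eexec = fontText.indexOf("eexec");
        String header = eexec >= 0 ? fontText.substring(0, eexec) : fontText;

        Matcher m = ENCODING.matcher(header);
        if (!m.find()) {
            return "MISSING";
        }
        String token = m.group(1);
        if (token.equals("[")) {
            return "ARRAY_LITERAL";   // e.g. /Encoding [ /space /A ... ] def
        }
        if (token.matches("\\d+")) {
            return "SIZED_ARRAY";     // e.g. /Encoding 256 array
        }
        return "NAME:" + token;       // e.g. /Encoding StandardEncoding def
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java EncodingProbe <font-file>");
            return;
        }
        byte[] bytes = Files.readAllBytes(Paths.get(args[0]));
        // ISO-8859-1 maps each byte to one char, so binary segments cannot break decoding.
        System.out.println(classify(new String(bytes, StandardCharsets.ISO_8859_1)));
    }
}
```

Running it against the extracted file prints one category. An ARRAY_LITERAL result would be consistent with the START_ARRAY token in the exception, but it would not by itself say whether the font is malformed or the parser is at fault.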