Date: Thu, 5 Jun 2014 21:02:02 +0000 (UTC)
From: Andreas Lehmkühler (JIRA)
To: dev@pdfbox.apache.org
Subject: [jira] [Commented] (PDFBOX-62) Incorrect (zero) character widths returned in some docs

    [ https://issues.apache.org/jira/browse/PDFBOX-62?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019273#comment-14019273 ]

Andreas Lehmkühler commented on PDFBOX-62:
------------------------------------------

The missing width values are now extracted from the true type font directly. I've added the changes in revision http://svn.apache.org/r1600761 to the trunk. I'm going to merge the changes to the 1.8 branch soon.

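For readers following along, here is a minimal sketch of the kind of fallback described in the comment above. It is not the code from r1600761. The class name WidthLookupSketch, the method widthFor, and the map of advance widths keyed by character code are all hypothetical and stand in for whatever the real implementation reads from the font dictionary and the embedded font program. The only facts assumed are that the font dictionary's width array covers the codes from FirstChar to LastChar, and that TrueType advance widths are stored in font design units, which have to be scaled by unitsPerEm to reach the 1/1000 units used for PDF glyph widths.

```java
import java.util.Map;

// Hypothetical illustration only; none of these names are PDFBox API.
final class WidthLookupSketch {

    private WidthLookupSketch() {
    }

    /**
     * Returns the width of a character code in 1/1000 text-space units.
     *
     * @param code          character code from the content stream
     * @param firstChar     first code covered by the widths array
     * @param lastChar      last code covered by the widths array
     * @param widths        widths from the font dictionary
     * @param advanceByCode assumed, pre-resolved map from character code to the
     *                      embedded font's advance width in font design units
     * @param unitsPerEm    design units per em of the embedded font
     */
    static float widthFor(int code, int firstChar, int lastChar, float[] widths,
                          Map<Integer, Integer> advanceByCode, int unitsPerEm) {
        // 1. Prefer the explicit width from the font dictionary.
        if (code >= firstChar && code <= lastChar) {
            int index = code - firstChar;
            if (index < widths.length) {
                return widths[index];
            }
        }
        // 2. Otherwise fall back to the embedded font and scale to 1/1000 units.
        Integer advance = advanceByCode.get(code);
        if (advance != null && unitsPerEm > 0) {
            return advance * 1000f / unitsPerEm;
        }
        // 3. Nothing known: this is the zero width the issue reports.
        return 0f;
    }
}
```

With a widths array covering only codes 33 and 34, a call for code 65 would skip step 1 and, given an advance of 1024 units in a 2048-unit em, return 500 instead of zero.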
> Incorrect (zero) character widths returned in some docs
> -------------------------------------------------------
>
>                 Key: PDFBOX-62
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-62
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Rendering, Text extraction
>    Affects Versions: 1.8.5, 2.0.0
>            Assignee: Andreas Lehmkühler
>             Fix For: 2.0.0
>
>         Attachments: 5542.pdf, PDFBOX-2059_PDTrueTypeFont.diff, PDTrueTypeFont.diff, pdfbox-2006-zerowidth.pdf-1.png, pdfbox-62-zerowidth.pdf-1.png
>
>
> [imported from SourceForge]
> http://sourceforge.net/tracker/index.php?group_id=78314&atid=552832&aid=1216674
> Originally submitted by tamirhassan on 2005-06-07 13:42.
> For certain PDF documents (such as the one attached)
> the character/string widths (as obtained e.g. by the
> PDFont.getStringWidth method) are not returned
> correctly, i.e. they appear to be correct for punctuation
> characters but are zero for alphanumeric characters.
> It seems as if these alphanumeric characters are NOT
> within PDFont.firstChar and PDFont.lastChar in the
> Type 1 font. The method therefore attempts to obtain
> the font widths from the AFM (font metric) file, but fails
> (silently) with a 'resource is null' logline message.
> (Note that this problem doesn't seem to occur with Type
> 1 fonts in other documents.)
> A more detailed discussion regarding this issue can be
> found in this link:
> http://sourceforge.net/forum/forum.php?thread_id=1260349&forum_id=267205
> Thanks in advance for any help that can be obtained,
> Tam

--
This message was sent by Atlassian JIRA
(v6.2#6252)