Date: Wed, 2 Dec 2015 17:25:11 +0000 (UTC)
From: "John Hewson (JIRA)"
To: dev@pdfbox.apache.org
Subject: [jira] [Commented] (PDFBOX-3150) IllegalArgumentException in getStringWidth/showText

[ https://issues.apache.org/jira/browse/PDFBOX-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15036188#comment-15036188 ]

John Hewson commented on PDFBOX-3150:
-------------------------------------

{quote}
As this is a quite common scenario, I suggest you provide a possibility to provide a fallback codepoint that can be used for all non-printable characters.
{quote}

Our philosophy is that we won't return broken values for strings which we can't measure.
{quote}
another issue is, that "PDFont.encode (int)" is not public - it would help to change this as well. Shall I create a separate issue?
{quote}

It's deliberate because people were constantly misusing that method. We don't allow PDFBox to create broken PDFs.

The PDF text model doesn't have a concept of newline characters, so that's something you need to address in your code - you're going to need to convert newlines into explicit text-moving operators.

If you have other Unicode characters and you're not sure whether the font supports them, you can retrieve the TTF's cmap via TrueTypeFont#getUnicodeCmap() and then call getGlyphId(unicode); if the result is non-zero, the glyph is present in the font.

> IllegalArgumentException in getStringWidth/showText
> ---------------------------------------------------
>
>                 Key: PDFBOX-3150
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3150
>             Project: PDFBox
>          Issue Type: Bug
>          Components: PDModel
>   Affects Versions: 2.0.0
>         Environment: 2.0.0-RC2
>            Reporter: Philip Helger
>            Assignee: John Hewson
>
> I want to get the string width using a Type0 font. Because I'm using a character not in the font (e.g. '\n') I'm getting the following exception:
> {code}
> Exception in thread "main" java.lang.IllegalArgumentException: No glyph for U+000A in font OpenSans
> 	at org.apache.pdfbox.pdmodel.font.PDCIDFontType2.encode(PDCIDFontType2.java:401)
> 	at org.apache.pdfbox.pdmodel.font.PDType0Font.encode(PDType0Font.java:351)
> 	at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:283)
> 	at org.apache.pdfbox.pdmodel.font.PDFont.getStringWidth(PDFont.java:312)
> {code}
> As this is a quite common scenario, I suggest you provide a possibility to provide a fallback codepoint that can be used for all non-printable characters.
> A similar exception happens when trying to print the text via the PDPageContentStream:
> {code}
> Exception in thread "main" java.lang.IllegalArgumentException: No glyph for U+000A in font OpenSans
> 	at org.apache.pdfbox.pdmodel.font.PDCIDFontType2.encode(PDCIDFontType2.java:401)
> 	at org.apache.pdfbox.pdmodel.font.PDType0Font.encode(PDType0Font.java:351)
> 	at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:283)
> 	at org.apache.pdfbox.pdmodel.PDPageContentStream.showText(PDPageContentStream.java:341)
> {code}
> I finally ended up creating my own "font.encode" method (with a lot of other hacks) that basically does the following:
> {code}
> final byte [] aFallbackBytes = aFont.encode (nFallbackCodepoint);
> byte [] aCPBytes;
> try
> {
>   // This method is package private
>   aCPBytes = aFont.encode (nCP);
> }
> catch (final IllegalArgumentException ex)
> {
>   aCPBytes = aFallbackBytes;
> }
> {code}
> -> another issue is, that "PDFont.encode (int)" is not public - it would help to change this as well. Shall I create a separate issue?

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
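
The advice in the comment (split on newlines yourself, and check glyph availability before measuring or showing text) could be sketched in caller code roughly as follows. This is only an illustration under stated assumptions: PrintableText, splitLines and replaceMissing are hypothetical names, not PDFBox API, and the glyph check is passed in as a plain IntPredicate so the snippet compiles without PDFBox on the classpath.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

/** Hypothetical caller-side helper; not part of PDFBox. */
final class PrintableText {
    private PrintableText() {
    }

    /**
     * Splits text on \r\n, \r or \n so that each entry can be measured or
     * shown on its own, with a text-moving operator emitted between entries.
     */
    static List<String> splitLines(String text) {
        List<String> lines = new ArrayList<>();
        // Limit -1 keeps trailing empty lines instead of silently dropping them.
        for (String line : text.split("\r\n|\r|\n", -1)) {
            lines.add(line);
        }
        return lines;
    }

    /**
     * Replaces every code point for which hasGlyph returns false with the
     * caller-chosen fallback. The caller must make sure the fallback itself
     * has a glyph in the font.
     */
    static String replaceMissing(String line, IntPredicate hasGlyph, int fallback) {
        StringBuilder sb = new StringBuilder(line.length());
        line.codePoints()
            .forEach(cp -> sb.appendCodePoint(hasGlyph.test(cp) ? cp : fallback));
        return sb.toString();
    }
}
```

With PDFBox, the predicate would presumably wrap the check described in the comment, i.e. something like cp -> cmap.getGlyphId(cp) != 0 with cmap obtained from TrueTypeFont#getUnicodeCmap(). Each resulting line would then go to showText, followed by a text-moving call on the content stream. The exact wiring depends on the PDFBox version in use and is not shown here.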