Date: Tue, 1 Dec 2015 17:34:10 +0000 (UTC)
From: "John Hewson (JIRA)"
To: dev@pdfbox.apache.org
Subject: [jira] [Commented] (PDFBOX-3138) PDTextField doesn't accept any Hebrew characters as new value

    [ https://issues.apache.org/jira/browse/PDFBOX-3138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15034142#comment-15034142 ]

John Hewson commented on PDFBOX-3138:
-------------------------------------

The embedded font used by the field does indeed contain Hebrew glyphs, and a valid "cmap" table which can be used to look up those glyphs. The mentioned character, U+05D7, is indeed present in the font.
The embedded font file is in OpenType format; however, the PDF Font dictionary is Type1 and specifies WinAnsiEncoding, which does not include Hebrew characters. So, strictly speaking, the field cannot be filled using any non-ANSI characters, and PDFBox's behaviour is correct.

It would seem that PDFBox could do something more helpful in this instance. Filling the form with Acrobat results in the font from the form's DR being overridden in the field itself with a new CIDFontType0 which has been created from the DR font. Ideally we would do that.

Do you have any control over the software producing these fields? I might be able to offer a workaround.

> PDTextField doesn't accept any Hebrew characters as new value
> -------------------------------------------------------------
>
>                 Key: PDFBOX-3138
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3138
>             Project: PDFBox
>          Issue Type: Bug
>          Components: AcroForm, FontBox
>    Affects Versions: 2.0.0
>         Environment: Eclipse 4.2.2, Windows 7 Pro, JRE 1.8.0_05
>            Reporter: Gilad Denneboom
>            Priority: Minor
>             Fix For: 2.1.0
>
>         Attachments: SetHebrewFieldValueTest.java, Test.pdf, Test.txt
>
>
> Trying to set a UTF-8 encoded Hebrew string as the value of a PDTextField fails with the following exception:
> {code}
> Exception in thread "main" java.lang.IllegalArgumentException: No glyph for U+05D7 in font AdobeHebrew-Regular
> 	at org.apache.pdfbox.pdmodel.font.PDType1CFont.encode(PDType1CFont.java:300)
> 	at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:283)
> 	at org.apache.pdfbox.pdmodel.PDPageContentStream.showText(PDPageContentStream.java:341)
> 	at org.apache.pdfbox.pdmodel.interactive.form.PlainTextFormatter.format(PlainTextFormatter.java:213)
> 	at org.apache.pdfbox.pdmodel.interactive.form.AppearanceGeneratorHelper.insertGeneratedAppearance(AppearanceGeneratorHelper.java:373)
> 	at org.apache.pdfbox.pdmodel.interactive.form.AppearanceGeneratorHelper.setAppearanceContent(AppearanceGeneratorHelper.java:237)
> 	at org.apache.pdfbox.pdmodel.interactive.form.AppearanceGeneratorHelper.setAppearanceValue(AppearanceGeneratorHelper.java:144)
> 	at org.apache.pdfbox.pdmodel.interactive.form.PDTextField.constructAppearances(PDTextField.java:263)
> 	at org.apache.pdfbox.pdmodel.interactive.form.PDTerminalField.applyChange(PDTerminalField.java:221)
> 	at org.apache.pdfbox.pdmodel.interactive.form.PDTextField.setValue(PDTextField.java:218)
> 	at SetHebrewFieldValueTest.main(SetHebrewFieldValueTest.java:22)
> {code}
> I've tried using multiple fonts for the field, all of which can handle Hebrew characters just fine, and got the same results in all of them.
> See attached files for a demonstration of the issue.