pdfbox-users mailing list archives

From Maruan Sahyoun <sahy...@fileaffairs.de>
Subject Re: Weird issue with fonts in input fields after merging
Date Fri, 09 Oct 2015 18:05:52 GMT
Hi,

> On 09.10.2015 at 18:28, Johannes Barre <johannes.barre@billfront.com> wrote:
> 
> Hello Maruan!
> 
> I don't want to push you, but even if you couldn't figure out everything,
> intermediate results or even just ideas would be helpful.
> 
> I experimented a bit more and found that setValue (used instead of my hack)
> sometimes works with umlauts and sometimes doesn't. With BIW_FORM.pdf they
> are scrambled, but with umlauts_ok.pdf they are fine. Any idea why they are
> scrambled in BIW_FORM.pdf? Do I need to convert the character encoding?
> How do I detect which encoding is required?
> 

Versions 1.8 and earlier did not really handle character encodings correctly,
especially if the font is subset. 2.0 does that correctly. I did a quick hack to support
encode() for Type 1C fonts, and after that your form works fine, even with the umlauts.

I've created https://issues.apache.org/jira/browse/PDFBOX-3016 for that.
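
On the "how do I detect which encoding is required" question: a rough pre-check before calling setValue is to test whether every character of the value has a code in the font's encoding. The sketch below is plain Ruby with no PDFBox calls. It assumes the field font uses WinAnsi (as pdffonts reports for MHGLSX+ProximaNova-Regular) and that Ruby's Windows-1252 is a close enough stand-in for WinAnsiEncoding. The helper names are made up for illustration. It cannot tell you whether a subset font actually contains the glyph.

```ruby
# Hypothetical helpers, not part of PDFBox.
# Assumption: Windows-1252 approximates the PDF WinAnsiEncoding.
WIN_ANSI_STAND_IN = "Windows-1252"

# True if the single character +ch+ has a code in the stand-in encoding.
def win_ansi_encodable?(ch)
  ch.encode(WIN_ANSI_STAND_IN)
  true
rescue Encoding::UndefinedConversionError
  false
end

# Returns the distinct characters of +value+ that cannot be encoded.
def unencodable_chars(value)
  value.each_char.reject { |ch| win_ansi_encodable?(ch) }.uniq
end

# Byte codes the value would use in the stand-in encoding.
def win_ansi_codes(value)
  value.encode(WIN_ANSI_STAND_IN).bytes
end

p unencodable_chars("Gr\u00fc\u00dfe")  # u with umlaut and sharp s
p unencodable_chars("\u03a9 test")      # Greek capital omega
p win_ansi_codes("\u00e4\u00f6\u00fc")  # a, o, u with umlaut
```

An empty result only means the encoding can represent the text; with a subset font the glyphs themselves may still be missing.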

BR
Maruan

> If I could fix this issue, I probably could use setValue. When using
> setValue, the values show up in Acrobat Reader even when I merge the
> documents :D
> 
> Greets, Johannes
> 
> On Thu, Oct 8, 2015 at 4:35 PM, Maruan Sahyoun <sahyoun@fileaffairs.de>
> wrote:
> 
>> Hi,
>> 
>>> On 08.10.2015 at 16:31, Johannes Barre <johannes.barre@billfront.com> wrote:
>>> 
>>> Hello!
>>> 
>>> I just tried the 2.0 snapshot from yesterday and get this error:
>>> 
>>> org/apache/pdfbox/pdmodel/font/PDType1CFont.java:283:in `encode':
>>> java.lang.UnsupportedOperationException: Not implemented: Type1C
>> 
>> there is already a ticket for that.
>> 
>> BR Maruan
>> 
>>> 
>>> Is that also true for 1.8.10 (just without the error) and could it be
>>> related to the problem?
>>> 
>>> Greets, Johannes
>>> 
>>> PS: I've also pushed a Java version of my code to the gist. It's probably
>>> as messy as my JRuby version, they're just experiments.
>>> 
>>> On Thu, Oct 8, 2015 at 3:41 PM, Johannes Barre <johannes.barre@billfront.com> wrote:
>>> 
>>>> Hello Maruan!
>>>> 
>>>> Thanks again. I hope my last answer didn't sound too aggressive (written
>>>> communication is difficult). I'm grateful for any help!
>>>> 
>>>> You brought up a good point: as a Linux user I've only checked with Google
>>>> Chrome & xpdf (and I was referring to xpdf). In Acrobat Reader 9 (Linux)
>>>> and XI (Win XP), the field values are not shown. So I've got a new
>>>> problem :'(
>>>> 
>>>> Greets, Johannes
>>>> 
>>>> On Thu, Oct 8, 2015 at 3:00 PM, Maruan Sahyoun <sahyoun@fileaffairs.de>
>>>> wrote:
>>>> 
>>>>> Hi,
>>>>> 
>>>>>> On 08.10.2015 at 14:53, Johannes Barre <johannes.barre@billfront.com> wrote:
>>>>>> 
>>>>>> Hello Maruan!
>>>>>> 
>>>>>> Thank you for your reply.
>>>>>> 
>>>>>> So, basically you're saying the source PDFs are already invalid? I've
>>>>>> asked, and they were created with Adobe InDesign; I would hope that
>>>>>> Adobe knows how to generate valid PDFs. :-/
>>>>> 
>>>>> the PDFs are not invalid - that's not what I wanted to say.
>>>>> 
>>>>>> But even so, why does everything look good when I just fill in the
>>>>>> fields without merging? It has the same issue with the font name, and
>>>>>> I filled the field with the same method.
>>>>> 
>>>>> when you say looking good - are you looking at it with Adobe Reader or
>>>>> XPDF or ….
>>>>> 
>>>>> I can have a more in-depth look tonight - my comments were about the
>>>>> quick observations I made.
>>>>> 
>>>>> BR
>>>>> Maruan
>>>>> 
>>>>>> Greets, Johannes
>>>>>> 
>>>>>> On Thu, Oct 8, 2015 at 2:35 PM, Maruan Sahyoun <sahyoun@fileaffairs.de> wrote:
>>>>>> 
>>>>>>> Hi,
>>>>>>> 
>>>>>>>> On 08.10.2015 at 13:30, Johannes Barre <johannes.barre@billfront.com> wrote:
>>>>>>>> 
>>>>>>>> Hello!
>>>>>>>> 
>>>>>>>> I have a weird issue. I have two PDFs. When I fill form fields in
>>>>>>>> one of them and save, everything looks fine. However, when I append
>>>>>>>> this filled PDF to another one, xpdf doesn't display the values
>>>>>>>> anymore and complains about missing fonts:
>>>>>>>> 
>>>>>>>> Syntax Error: Unknown font tag 'ProximaNova-Regular'
>>>>>>>> Syntax Error: Unknown font in field's DA string
>>>>>>>> Syntax Error: Unknown font tag 'ProximaNova-Regular'
>>>>>>>> Syntax Error: Unknown font in field's DA string
>>>>>>>> 
>>>>>>>> I'm using JRuby (9k), but I hope it's understandable for you. I put
>>>>>>>> the source & PDFs in this gist:
>>>>>>>> https://gist.github.com/iGEL/a8484f0bc44b03fa9de1 (will delete it
>>>>>>>> later, once the issue is solved)
>>>>>>>> 
>>>>>>>> Other specs: pdfbox-app-1.8.10, openjdk 1.8.0_66, Debian Jessie
>>>>>>>> inside of Docker
>>>>>>>> 
>>>>>>>> As you can see, I use a special way to set the values. I had
>>>>>>>> problems with German umlauts using setValue, and it also sometimes
>>>>>>>> fails (possibly related to
>>>>>>>> https://issues.apache.org/jira/browse/PDFBOX-1550, the message is
>>>>>>>> the same as in that bug)
>>>>>>> 
>>>>>>> setting the field value directly using
>>>>>>> 
>>>>>>> form.getField(name).getDictionary.setItem(
>>>>>>>  Java::OrgApachePdfboxCos::COSName::V,
>>>>>>>  Java::OrgApachePdfboxCos::COSString.new(value)
>>>>>>> )
>>>>>>> 
>>>>>>> will not update the visual appearance of the field, and as a result the
>>>>>>> newly set value is not visible.
>>>>>>> 
>>>>>>> 
>>>>>>>> The COVER_PAGE.pdf and BIW_FORM.pdf are the templates I'm using,
>>>>>>>> form_filled.pdf is just the BIW_FORM.pdf with 2 fields filled, and
>>>>>>>> merged.pdf is COVER_PAGE.pdf and form_filled.pdf merged together.
>>>>>>>> 
>>>>>>>> The p calls in lines 15 and 22 print the DA value of the field, and
>>>>>>>> it's the same for both files:
>>>>>>>> 
>>>>>>>> "/ProximaNova-Regular 9 Tf 0.019 0.305 0.627 rg" # form_filled.pdf
>>>>>>>> "/ProximaNova-Regular 9 Tf 0.019 0.305 0.627 rg" # merged.pdf
>>>>>>> 
>>>>>>> The font resource is called /ProximaNova-Regular, but that's not in
>>>>>>> your PDF: the font which is in your PDF is called
>>>>>>> /MHGLSX+ProximaNova-Regular. In addition, the issue with a font subset
>>>>>>> is that only certain characters are part of that subset. As a result,
>>>>>>> some of the characters you need to display your field value might not
>>>>>>> be within the subset.
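
A minimal sketch of that name mismatch in plain Ruby (no PDFBox calls). The DA string and the font names are copied from this thread; the helper names and the regular expressions are assumptions made for illustration. The subset prefix is taken to be six uppercase letters followed by "+". In a real document the tag would be looked up in the form's default resources (DR) rather than compared with pdffonts output.

```ruby
# Hypothetical helpers, not part of PDFBox.

# Matches the "/FontTag size Tf" sequence inside a DA string.
DA_FONT = %r{/(\S+)\s+([\d.]+)\s+Tf}

# Six uppercase letters plus "+" mark a subset font, e.g. "MHGLSX+".
SUBSET_PREFIX = /\A[A-Z]{6}\+/

# Returns the font tag named in the DA string, or nil if none is found.
def da_font_tag(da)
  match = DA_FONT.match(da)
  match && match[1]
end

# Strips a subset prefix from a font name, if present.
def base_font_name(name)
  name.sub(SUBSET_PREFIX, "")
end

da = "/ProximaNova-Regular 9 Tf 0.019 0.305 0.627 rg"
font_names = ["NPQRGV+ProximaNova-Light", "MHGLSX+ProximaNova-Regular"]

tag = da_font_tag(da)
exact_match = font_names.include?(tag)
base_matches = font_names.select { |name| base_font_name(name) == tag }

p tag
p exact_match
p base_matches
```

Here the tag has no exact match among the font names, and only matches once the subset prefix is stripped.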
>>>>>>> 
>>>>>>> BR
>>>>>>> Maruan
>>>>>>> 
>>>>>>> 
>>>>>>>> 
>>>>>>>> According to pdffonts, this font is in both files:
>>>>>>>> 
>>>>>>>> $ pdffonts form_filled.pdf
>>>>>>>> name                                 type              encoding         emb sub uni object ID
>>>>>>>> ------------------------------------ ----------------- ---------------- --- --- --- ---------
>>>>>>>> NPQRGV+ProximaNova-Light             Type 1C           Custom           yes yes yes    124  0
>>>>>>>> *MHGLSX+ProximaNova-Regular           Type 1C           WinAnsi          yes yes yes    125  0*
>>>>>>>> NPQRGV+ProximaNova-Bold              Type 1C           Custom           yes yes yes    126  0
>>>>>>>> MHGLSX+Facit-Bold                    Type 1C           Custom           yes yes yes    127  0
>>>>>>>> NPQRGV+ProximaNova-Bold              Type 1C           WinAnsi          yes yes yes    218  0
>>>>>>>> NPQRGV+ProximaNova-Light             Type 1C           WinAnsi          yes yes yes    219  0
>>>>>>>> ProximaNova-Bold                     Type 1C (OT)      Custom           yes no  no       8  0
>>>>>>>> ProximaNova-Light                    Type 1C (OT)      Custom           yes no  no       9  0
>>>>>>>> NPQRGV+ProximaNova-Bold              Type 1C           WinAnsi          yes yes yes    251  0
>>>>>>>> NPQRGV+ProximaNova-Light             Type 1C           WinAnsi          yes yes yes    252  0
>>>>>>>> NPQRGV+ProximaNova-Bold              Type 1C           WinAnsi          yes yes yes    254  0
>>>>>>>> NPQRGV+ProximaNova-Light             Type 1C           WinAnsi          yes yes yes    255  0
>>>>>>>> FJORTL+ProximaNova-Light             CID Type 0C       Identity-H       yes yes yes    165  0
>>>>>>>> NPQRGV+ProximaNova-Bold              Type 1C           WinAnsi          yes yes yes    259  0
>>>>>>>> NPQRGV+ProximaNova-Light             Type 1C           WinAnsi          yes yes yes    260  0
>>>>>>>> 
>>>>>>>> $ pdffonts merged.pdf
>>>>>>>> name                                 type              encoding         emb sub uni object ID
>>>>>>>> ------------------------------------ ----------------- ---------------- --- --- --- ---------
>>>>>>>> AYOVHV+Facit-Bold                    Type 1C           Custom           yes yes yes    131  0
>>>>>>>> AYOVHV+ProximaNova-Bold              Type 1C           Custom           yes yes yes    132  0
>>>>>>>> AYOVHV+ProximaNova-Light             Type 1C           Custom           yes yes yes    133  0
>>>>>>>> AYOVHV+ProximaNova-Semibold          Type 1C           WinAnsi          yes yes yes    134  0
>>>>>>>> ProximaNova-Light                    Type 1C (OT)      Custom           yes no  no       9  0
>>>>>>>> AYOVHV+ProximaNova-Light             Type 1C           WinAnsi          yes yes no     192  0
>>>>>>>> AYOVHV+ProximaNova-Light             Type 1C           WinAnsi          yes yes no     193  0
>>>>>>>> NPQRGV+ProximaNova-Light             Type 1C           Custom           yes yes yes    275  0
>>>>>>>> *MHGLSX+ProximaNova-Regular           Type 1C           WinAnsi          yes yes yes    276  0*
>>>>>>>> NPQRGV+ProximaNova-Bold              Type 1C           Custom           yes yes yes    277  0
>>>>>>>> MHGLSX+Facit-Bold                    Type 1C           Custom           yes yes yes    278  0
>>>>>>>> NPQRGV+ProximaNova-Bold              Type 1C           WinAnsi          yes yes yes    437  0
>>>>>>>> NPQRGV+ProximaNova-Light             Type 1C           WinAnsi          yes yes yes    438  0
>>>>>>>> ProximaNova-Bold                     Type 1C (OT)      Custom           yes no  no     462  0
>>>>>>>> ProximaNova-Light                    Type 1C (OT)      Custom           yes no  no     512  0
>>>>>>>> NPQRGV+ProximaNova-Bold              Type 1C           WinAnsi          yes yes yes    500  0
>>>>>>>> NPQRGV+ProximaNova-Light             Type 1C           WinAnsi          yes yes yes    501  0
>>>>>>>> NPQRGV+ProximaNova-Bold              Type 1C           WinAnsi          yes yes yes    503  0
>>>>>>>> NPQRGV+ProximaNova-Light             Type 1C           WinAnsi          yes yes yes    504  0
>>>>>>>> FJORTL+ProximaNova-Light             CID Type 0C       Identity-H       yes yes yes    377  0
>>>>>>>> NPQRGV+ProximaNova-Bold              Type 1C           WinAnsi          yes yes yes    451  0
>>>>>>>> NPQRGV+ProximaNova-Light             Type 1C           WinAnsi          yes yes yes    452  0
>>>>>>>> 
>>>>>>>> Why are the field values not showing up and how can I fix that?
>>>>>>>> 
>>>>>>>> Thanks for your help!
>>>>>>>> 
>>>>>>>> Johannes
>>>>>>> 
>>>>>>> 
>>>>>>> ---------------------------------------------------------------------
>>>>>>> To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
>>>>>>> For additional commands, e-mail: users-help@pdfbox.apache.org
>>>>> 
>>>>> 
>>>> 
>> 
>> 
>> 

