pdfbox-users mailing list archives

From Kevin Ternes <KTer...@thegeneral.com>
Subject RE: How flatten without changing appearance
Date Mon, 21 Sep 2015 18:37:13 GMT
Maruan,

I finally got back around to trying out this new flatten method.  It works fantastically as far
as preserving the appearance of the fields goes.
However, if I flatten before a merge, I am still losing the data entered into the fields.


-----Original Message-----
From: Maruan Sahyoun [mailto:sahyoun@fileaffairs.de] 
Sent: Tuesday, September 15, 2015 4:15 AM
To: users@pdfbox.apache.org
Subject: Re: How flatten without changing appearance

Hi,
> On 15.09.2015 at 09:06, Tilman Hausherr <THausherr@t-online.de> wrote:
> 
> On 15.09.2015 at 09:02, Maruan Sahyoun wrote:
>>> On 15.09.2015 at 08:56, Tilman Hausherr <THausherr@t-online.de> wrote:
>>> 
>>> On 15.09.2015 at 08:48, Maruan Sahyoun wrote:
>>>> Hi Tilman,
>>>> 
>>>>> On 15.09.2015 at 08:17, Tilman Hausherr <THausherr@t-online.de> wrote:
>>>>> 
>>>>> On 15.09.2015 at 04:47, Maruan Sahyoun wrote:
>>>>>> Hi Kevin,
>>>>>> 
>>>>>> I've created https://issues.apache.org/jira/browse/PDFBOX-2970 for that.
>>>>>> 
>>>>>> In addition you can start playing with this code, based on the current
>>>>>> PDFBox 2.0.0 snapshot.
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>         PDDocument doc = PDDocument.load(new File(...));
>>>>>>         PDAcroForm acroForm = doc.getDocumentCatalog().getAcroForm();
>>>>>>         // Iterate over all form fields and their widgets and create a
>>>>>>         // FormXObject at the page content level from that
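>>>>>>         for (PDField field : acroForm.getFieldTree())
>>>>>>         {
>>>>>>             // only terminal fields carry widgets; skip non-terminal nodes
>>>>>>             // so that the cast below cannot throw a ClassCastException
>>>>>>             if (!(field instanceof PDTerminalField))
>>>>>>             {
>>>>>>                 continue;
>>>>>>             }
>>>>>>             for (PDAnnotationWidget widget : ((PDTerminalField)field).getWidgets())
>>>>>>             {
>>>>>>                 PDPage page = widget.getPage();
>>>>>>                 PDPageContentStream contentStream = new PDPageContentStream(doc, page, true, true);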
>>>>> I wasn't 100% sure what was meant with "flatten", but now I get it!
>>>>> 
>>>>> I think the first of the "new PDPageContentStream" calls should also pass the
>>>>> fifth parameter, set to true, in case the existing content stream isn't enclosed in q...Q.
>>>> thanks for stepping in.
>>>> 
>>>> Is that really needed, given that we are storing/restoring the current state before
>>>> the new content is added? Putting a store/restore around the existing stream - assuming
>>>> there is only one - would store/restore to the state before there was any content stream.
>>>> In addition, if there are already multiple streams before the new one, would we need to
>>>> put a store/restore around each of them? And if there weren't any, would that potentially
>>>> change the behavior of the old processing?
>>> The constructor identifies whether there is already content, and only then does it insert
>>> a saveGraphicsState() before the first existing stream and a restoreGraphicsState() at the
>>> end of the new one, so it works with none, one, or many.
>>> 
>>> Doing it only for the first one avoids having too many nested
>>> q...Q pairs.
>> looking at how Adobe does it, each of the new content streams is wrapped in a q...Q
>> - that's why I did the same.
> 
> That isn't what I meant. I meant that only the first one should be
> 
> PDPageContentStream contentStream = new PDPageContentStream(doc, page, 
> true, true, true);
> 
> instead of
> 
> PDPageContentStream contentStream = new PDPageContentStream(doc, page, 
> true, true);
> 
> all the rest is fine as it is.
> 

done - thanks for the clarification.
Maruan

> 
> Tilman
> 
> 
> 
>> 
>>> What I was thinking of is cases where the existing content stream(s) modify
>>> the CTM without resetting it. This would have the effect that the "fields" would appear at
>>> the wrong place / wrong size / outside of the screen. We see this every few months:
>>> somebody wants to add a stamp to a page, and it works for most files, except for one "special" file.
>>> 
>>> Tilman
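
The CTM problem Tilman describes above can be reproduced without PDFBox. The following is only a toy model under stated assumptions: the class `CtmSketch` and the method `placeField` are hypothetical names, `java.awt.geom.AffineTransform` stands in for the PDF current transformation matrix, and the 0.5 scale is an arbitrary stand-in for whatever the existing content stream does to the CTM. It compares drawing a field appearance with and without a q...Q pair around the existing content.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Point2D;
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy model of the PDF graphics state (CTM plus q/Q stack). Not PDFBox code. */
final class CtmSketch {
    private AffineTransform ctm = new AffineTransform(); // starts as identity
    private final Deque<AffineTransform> stack = new ArrayDeque<>();

    void save() { stack.push(new AffineTransform(ctm)); }   // models "q"
    void restore() { ctm = stack.pop(); }                   // models "Q"
    void concat(AffineTransform m) { ctm.concatenate(m); }  // models "cm"

    /**
     * Returns the page position of the appearance origin when a field is drawn
     * at (x, y) after existing content that scaled the CTM by 0.5 and never
     * restored it. resetContext models wrapping the existing content in q...Q.
     */
    static Point2D placeField(boolean resetContext, double x, double y) {
        CtmSketch gs = new CtmSketch();
        if (resetContext) {
            gs.save();                                       // q before existing content
        }
        gs.concat(AffineTransform.getScaleInstance(0.5, 0.5)); // existing content, no Q of its own
        if (resetContext) {
            gs.restore();                                    // Q before the new content
        }
        // new content: q, translate to the widget's lower-left corner, draw, Q
        gs.save();
        gs.concat(AffineTransform.getTranslateInstance(x, y));
        Point2D origin = gs.ctm.transform(new Point2D.Double(0, 0), null);
        gs.restore();
        return origin;
    }

    public static void main(String[] args) {
        // without the reset, the leftover scale halves the position: (50, 100)
        System.out.println(placeField(false, 100, 200));
        // with the reset, the field lands where the widget rectangle says: (100, 200)
        System.out.println(placeField(true, 100, 200));
    }
}
```

In this model the q/Q pair around each newly drawn field does not help, because the leftover scale is already part of the state being saved; only resetting the state left behind by the existing content restores the expected position.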
>>> 
>>> 
>>>> BR
>>>> Maruan
>>>> 
>>>> 
>>>> 
>>>>> Tilman
>>>>> 
>>>>>>                 PDFormXObject fieldObject = new PDFormXObject(
>>>>>>                         widget.getAppearance().getNormalAppearance().getAppearanceStream().getCOSStream());
>>>>>>                 Matrix translationMatrix = Matrix.getTranslateInstance(
>>>>>>                         widget.getRectangle().getLowerLeftX(), widget.getRectangle().getLowerLeftY());
>>>>>>                 contentStream.saveGraphicsState();
>>>>>>                 contentStream.transform(translationMatrix);
>>>>>>                 contentStream.drawForm(fieldObject);
>>>>>>                 contentStream.restoreGraphicsState();
>>>>>>                 contentStream.close();
>>>>>>             }
>>>>>>         }
>>>>>> 
>>>>>>         // preserve all non widget annotations
>>>>>>         for (PDPage page : doc.getPages())
>>>>>>         {
>>>>>>             List<PDAnnotation> annotations = new ArrayList<PDAnnotation>();
>>>>>>             for (PDAnnotation annotation : page.getAnnotations())
>>>>>>             {
>>>>>>                 if (!(annotation instanceof PDAnnotationWidget))
>>>>>>                 {
>>>>>>                     annotations.add(annotation);
>>>>>>                 }
>>>>>>             }
>>>>>>             page.setAnnotations(annotations);
>>>>>>         }
>>>>>>
>>>>>>         // remove the fields
>>>>>>         acroForm.setFields(Collections.<PDField>emptyList());
>>>>>>
>>>>>>         doc.save(...);
>>>>>>         doc.close();
>>>>>> 
>>>>>> 
>>>>>> BR
>>>>>> Maruan
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>> On 14.09.2015 at 17:57, Maruan Sahyoun <sahyoun@fileaffairs.de> wrote:
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>>> On 14.09.2015 at 16:53, Kevin Ternes <KTernes@thegeneral.com> wrote:
>>>>>>>> 
>>>>>>>> I am trying to refactor for 2.0.0-SNAPSHOT now.
>>>>>>>> 
>>>>>>>> Can someone tell me how to flatten a PDDocument in PDFBox 2?
>>>>>>>> All the examples seem to pertain to v1.x
>>>>>>> there are none, but I've started to get that into the project
>>>>>>> 
>>>>>>>> -----Original Message-----
>>>>>>>> From: Maruan Sahyoun [mailto:sahyoun@fileaffairs.de]
>>>>>>>> Sent: Saturday, September 12, 2015 1:55 AM
>>>>>>>> To: users@pdfbox.apache.org
>>>>>>>> Subject: Re: How flatten without changing appearance
>>>>>>>> 
>>>>>>>> Hi,
>>>>>>>> 
>>>>>>>>> On 12.09.2015 at 00:08, Kevin Ternes <KTernes@thegeneral.com> wrote:
>>>>>>>>> 
>>>>>>>>> Maruan,
>>>>>>>>> That would be great.  Please have a look at:
>>>>>>>>> https://onedrive.live.com/redir?resid=9CCA324BE57ADA7!76929&authkey=!AKE0x0fh5QDkIIw&ithint=file%2czip
>>>>>>>>> This should demonstrate the problem by processing several files.
>>>>>>>> I took a quick look. Which version of PDFBox are you using?
>>>>>>>> Would it be possible to go to (the yet to be released) 2.0.0 version?
>>>>>>>> 
>>>>>>>> BR
>>>>>>>> Maruan
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> The user's complaint is: boxes are being greyed out and
>>>>>>>>> font sizes are being changed by the flatten.
>>>>>>>>> My flatten is done by the PDFBoxUtils.flatten() method.
>>>>>>>>> 
>>>>>>>>> Thanks,
>>>>>>>>> -Kevin
>>>>>>>>> 
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Maruan Sahyoun [mailto:sahyoun@fileaffairs.de]
>>>>>>>>> Sent: Friday, September 11, 2015 12:17 PM
>>>>>>>>> To: users@pdfbox.apache.org
>>>>>>>>> Subject: Re: How to merge forms where there is
>>>>>>>>> 
>>>>>>>>> how urgent is it for you? If you could share a sample
>>>>>>>>> file and your code for flattening I could take a look.
>>>>>>>>> 
>>>>>>>>> BR
>>>>>>>>> Maruan
>>>>>>>>> 
>>>>>>>>>> On 11.09.2015 at 18:23, Kevin Ternes <KTernes@thegeneral.com> wrote:
>>>>>>>>>> 
>>>>>>>>>> I have not been able to successfully flatten a document
>>>>>>>>>> without affecting the formatting of the fields.
>>>>>>>>>> There are some code examples out there but none of
>>>>>>>>>> them work correctly for me.
>>>>>>>>>> 
>>>>>>>>>> -----Original Message-----
>>>>>>>>>> From: Maruan Sahyoun [mailto:sahyoun@fileaffairs.de]
>>>>>>>>>> Sent: Friday, September 11, 2015 3:15 AM
>>>>>>>>>> To: users@pdfbox.apache.org
>>>>>>>>>> Subject: Re: How to merge forms where there is
>>>>>>>>>> 
>>>>>>>>>> Hi Kevin
>>>>>>>>>>> None of these field values will need to be changed
>>>>>>>>>>> after the merge.  They are set to read-only.
>>>>>>>>>>> I tried flattening the source fields but most
>>>>>>>>>>> of these documents rely on field annotation for the field value formatting and this is not
>>>>>>>>>>> to be changed.
>>>>>>>>>> when a field is flattened correctly, the formatting
>>>>>>>>>> of the annotations visually representing the field becomes part of the page content stream.
>>>>>>>>>> So if you don't need the fields at all - as there is no further input - you could flatten
>>>>>>>>>> the source documents prior to merging them.



