pdfbox-users mailing list archives

From Tilman Hausherr <til...@apache.org>
Subject Re: AW: [bulk]: Re: Encoding of names dictionary and GoToE target
Date Fri, 19 Apr 2019 05:31:54 GMT
I meant uploading to a file-sharing host (attachments to the list are deleted unless they get
stuck in moderation, and your second one was not), but never mind. I removed the JavaScript
programmatically and found this:

/Names [<FEFF0045006D0070007400790050006100670065005000440046002E007000640066> 15 0 R]

So the UTF-16 string is already in the original file.
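
To check what that hex string contains, here is a minimal sketch in plain Java, with no PDFBox involved. It treats a leading FE FF as the UTF-16BE byte order mark and otherwise falls back to ISO 8859-1. That fallback is only an assumption standing in for PDFDocEncoding, which matches it for plain ASCII names but not for every byte value. The class and method names are made up for this illustration.

```java
import java.nio.charset.StandardCharsets;

public class NameStringDecoder
{
    // Convert a hex dump such as "FEFF0045" into raw bytes.
    static byte[] hexToBytes(String hex)
    {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++)
        {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    // A PDF text string that starts with FE FF is encoded as UTF-16BE.
    static boolean hasUtf16BeBom(byte[] raw)
    {
        return raw.length >= 2 && (raw[0] & 0xFF) == 0xFE && (raw[1] & 0xFF) == 0xFF;
    }

    static String decode(byte[] raw)
    {
        if (hasUtf16BeBom(raw))
        {
            // Skip the two BOM bytes and decode the rest as big-endian UTF-16.
            return new String(raw, 2, raw.length - 2, StandardCharsets.UTF_16BE);
        }
        // Assumption: ISO 8859-1 as an approximation of PDFDocEncoding.
        return new String(raw, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args)
    {
        String hex = "FEFF0045006D0070007400790050006100670065005000440046002E007000640066";
        System.out.println(decode(hexToBytes(hex))); // prints "EmptyPagePDF.pdf"
    }
}
```

The decoded text is the same attachment name that the posted code looks up, so the name itself is pure ASCII and only its byte representation in the file differs.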

Tilman


On 2019/04/18 09:24:26, "Gueclue, Dahit" <Dahit.Gueclue@PROSTEP.com> wrote: 
> Here are the files without javascript. Also I used this code to produce the output:
> 
> import java.io.File;
> import java.io.IOException;
> import java.util.LinkedList;
> import java.util.List;
> import java.util.Map;
> 
> import org.apache.pdfbox.cos.COSName;
> import org.apache.pdfbox.pdmodel.PDDocument;
> import org.apache.pdfbox.pdmodel.PDEmbeddedFilesNameTreeNode;
> import org.apache.pdfbox.pdmodel.PDPage;
> import org.apache.pdfbox.pdmodel.common.PDRectangle;
> import org.apache.pdfbox.pdmodel.common.filespecification.PDComplexFileSpecification;
> import org.apache.pdfbox.pdmodel.interactive.action.PDActionEmbeddedGoTo;
> import org.apache.pdfbox.pdmodel.interactive.action.PDTargetDirectory;
> import org.apache.pdfbox.pdmodel.interactive.annotation.PDAnnotation;
> import org.apache.pdfbox.pdmodel.interactive.annotation.PDAnnotationLink;
> import org.apache.pdfbox.pdmodel.interactive.documentnavigation.destination.PDPageDestination;
> import org.apache.pdfbox.pdmodel.interactive.documentnavigation.destination.PDPageFitWidthDestination;
> 
> 
> public class AddAnnotations
> {
>     public static void main(String[] args) throws IOException
>     {
> 
>         File file = new File("PDF with 1 PDF doc attachment.pdf");
>         PDDocument document = PDDocument.load(file);
> 
>         PDPage page0 = document.getPage(0);
>         List<PDAnnotation> annotations = page0.getAnnotations();
> 
>         try
>         {
> 
>             // embedded files are stored in a named tree
>             PDEmbeddedFilesNameTreeNode efTree = document.getDocumentCatalog().getNames().getEmbeddedFiles();
>             Map<String, PDComplexFileSpecification> names = efTree.getNames();
>             LinkedList<String> targets = new LinkedList<String>();
>             targets.addAll(names.keySet());
> 
>             PDComplexFileSpecification fs = names.get("EmptyPagePDF.pdf");
> 
>             PDDocument ef = PDDocument.load(fs.getEmbeddedFile().createInputStream());
>             PDPage page = ef.getPage(0);
>             PDAnnotationLink annotation = new PDAnnotationLink();
>             PDActionEmbeddedGoTo action = new PDActionEmbeddedGoTo();
>             PDTargetDirectory target = new PDTargetDirectory();
>             PDPageDestination dest = new PDPageFitWidthDestination();
> 
>             String name = targets.get(0);
>             //byte[] utf16 = new String("EmptyPagePDF.pdf").getBytes("UTF-16");
>             //name = new String(utf16); // works if this name is used instead
>             target.setFilename(name);
>             target.setRelationship(COSName.C);
>             action.setTargetDirectory(target);
> 
>             //action.setFile(fs);
>             dest.setPage(page);
>             action.setDestination(dest);
>             dest.setPageNumber(0);
>             annotation.setAction(action);
>             annotation.setRectangle(new PDRectangle(10, 10, 100, 100));
> 
>             annotations.add(annotation);
> 
> 
>             // save the PDF
>             document.save("GoToE link output.pdf");
>             ef.close();
>         }
>         finally
>         {
>             document.close();
>         }
>     }
> 
> }
> 
> 
> The name of the attachment did not contain any special characters, but the name in the
> target directory and the name in the names dictionary used different encodings. The
> specification for GoToE actions requires a destination, type, subtype and either a file
> specification or (at least for document-level attachments) a target directory. As a possible
> workaround I tried setting the file specification of the attachment on the action, but that
> did not work either. If I knew how GoToE works with file specifications, that would also be
> enough.
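
To make the mismatch concrete, here is a small sketch in plain Java that encodes the same name both ways and prints the bytes. It relies on Java's "UTF-16" charset writing a big-endian byte order mark (FE FF) when encoding. Whether a viewer really compares the two strings byte for byte is an assumption; it would explain the behaviour reported here, nothing more. The class and method names are made up for this illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class NameEncodingCompare
{
    // Render bytes as upper-case hex, the way they appear in a PDF hex string.
    static String toHex(byte[] bytes)
    {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes)
        {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Java's UTF-16 encoder emits FE FF followed by big-endian code units.
    static byte[] asUtf16WithBom(String name)
    {
        return name.getBytes(StandardCharsets.UTF_16);
    }

    // One byte per character, as written for the target directory in the output file.
    static byte[] asSingleByte(String name)
    {
        return name.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args)
    {
        String name = "EmptyPagePDF.pdf";
        byte[] wide = asUtf16WithBom(name);
        byte[] narrow = asSingleByte(name);

        System.out.println(toHex(wide));
        System.out.println(toHex(narrow));
        // Same text, different bytes: a byte-wise comparison reports a mismatch.
        System.out.println(Arrays.equals(wide, narrow)); // prints "false"
    }
}
```

The first line of output is the same hex sequence as the entry in the /Names array, which is why the name only matched after the manual UTF-16 conversion.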
> 
> Dahit
> 
> -----Original Message-----
> From: Tilman Hausherr [mailto:tilman@apache.org]
> Sent: Thursday, 18 April 2019 06:15
> To: users@pdfbox.apache.org
> Subject: [bulk]: Re: Encoding of names dictionary and GoToE target
> 
> Please upload the files without JavaScript to reduce my fear of opening them. Or better,
> post the code you use to create these files so that one can run it and create the files,
> including the attachment. I looked at the PDF specification, and from what I see with
> PDFDebugger your files look fine. I assume the effect you mention happens with the attached
> files even though the names were pure ASCII?
> 
> Tilman
> 
> On 2019/04/16 12:47:37, "Gueclue, Dahit" <Dahit.Gueclue@PROSTEP.com> wrote:
> > Hello,
> >
> > I am currently working with PDFBox 2.0.14 and I am trying to create GoToE links for
> > document attachments. For this I created a test PDF file with one PDF attachment in
> > Adobe Acrobat XI.
> > The link is a PDAnnotationLink with a PDActionEmbeddedGoTo action. The filename of the
> > target directory receives the name extracted from the EmbeddedFiles name dictionary.
> > After adding the name, action and rectangle to the annotation, I save the resulting
> > document and open it with Adobe Acrobat Reader DC.
> >
> > The problem I have is that the GoToE link does not open the attachment. After looking at
> > the file in a text editor, it seems that the file name in the names dictionary uses a
> > different encoding, UTF-16BE, than the file name in the target directory, ISO 8859-1. If
> > I manually convert the file name to UTF-16 before adding it to the target directory, it
> > works just fine. However, if I do not know the encoding of the names in the names
> > dictionary, I cannot generate the correct name for the target directory.
> >
> > Is there a way to determine which encoding the names dictionary uses, or are there other
> > workarounds for this? I tried to set the file specification of the attachment on the
> > action, but that did not work for me.
> >
> > Attached are the input and output files.
> >
> >
> > Regards,
> >
> > Dahit Güclü
> >
> >
> > ________________________________________________________________________
> > PROSTEP AG, Dolivostraße 11, D-64293 Darmstadt
> > HR: Amtsgericht Darmstadt, HRB 8383
> > Vorstand: Dr. Bernd Pätzold (Vorsitz), Reinhard Betz
> > Dr. Karsten Theis
> > Aufsichtsrat: Dr. Heinz-Gerd Lehnhoff (Vorsitz)
> > ________________________________________________________________________
> >
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
> For additional commands, e-mail: users-help@pdfbox.apache.org
> 


