pdfbox-users mailing list archives

From "Gueclue, Dahit" <Dahit.Guec...@PROSTEP.com>
Subject AW: [bulk]: Re: Encoding of names dictionary and GoToE target
Date Thu, 18 Apr 2019 09:24:26 GMT
Here are the files without JavaScript. I used the following code to produce the output:

import java.io.File;
import java.io.IOException;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

import org.apache.pdfbox.cos.COSName;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDEmbeddedFilesNameTreeNode;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.common.PDRectangle;
import org.apache.pdfbox.pdmodel.common.filespecification.PDComplexFileSpecification;
import org.apache.pdfbox.pdmodel.interactive.action.PDActionEmbeddedGoTo;
import org.apache.pdfbox.pdmodel.interactive.action.PDTargetDirectory;
import org.apache.pdfbox.pdmodel.interactive.annotation.PDAnnotation;
import org.apache.pdfbox.pdmodel.interactive.annotation.PDAnnotationLink;
import org.apache.pdfbox.pdmodel.interactive.documentnavigation.destination.PDPageDestination;
import org.apache.pdfbox.pdmodel.interactive.documentnavigation.destination.PDPageFitWidthDestination;


public class AddAnnotations
{
    public static void main(String[] args) throws IOException
    {

        File file = new File("PDF with 1 PDF doc attachment.pdf");
        PDDocument document = PDDocument.load(file);

        PDPage page0 = document.getPage(0);
        List<PDAnnotation> annotations = page0.getAnnotations();

        try
        {

            // embedded files are stored in a name tree
            PDEmbeddedFilesNameTreeNode efTree = document.getDocumentCatalog().getNames().getEmbeddedFiles();
            Map<String, PDComplexFileSpecification> names = efTree.getNames();
            LinkedList<String> targets = new LinkedList<String>();
            targets.addAll(names.keySet());

            PDComplexFileSpecification fs = names.get("EmptyPagePDF.pdf");

            PDDocument ef = PDDocument.load(fs.getEmbeddedFile().createInputStream());
            PDPage page = ef.getPage(0);
            PDAnnotationLink annotation = new PDAnnotationLink();
            PDActionEmbeddedGoTo action = new PDActionEmbeddedGoTo();
            PDTargetDirectory target = new PDTargetDirectory();
            PDPageDestination dest = new PDPageFitWidthDestination();

            // use the first key of the EmbeddedFiles name tree as the target name
            String name = targets.get(0);
            //byte[] utf16 = new String("EmptyPagePDF.pdf").getBytes("UTF-16");
            //name = new String(utf16); // works if this name is used instead
            target.setFilename(name);
            target.setRelationship(COSName.C);
            action.setTargetDirectory(target);

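            //action.setFile(fs);
            dest.setPage(page);
            action.setDestination(dest);
            // a destination in another document is addressed by page number,
            // so this replaces the page object set above
            dest.setPageNumber(0);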
            annotation.setAction(action);
            annotation.setRectangle(new PDRectangle(10, 10, 100, 100));

            annotations.add(annotation);


            // save the PDF
            document.save("GoToE link output.pdf");
            ef.close();
        }
        finally
        {
            document.close();
        }
    }

}
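
The commented-out workaround in the listing can be looked at in isolation with plain JDK calls.
The sketch below is only an illustration (the class and method names are made up, nothing here
is PDFBox API): it prints the bytes of the attachment name in ISO 8859-1 and in Java's "UTF-16"
charset, which writes a big-endian byte order mark (FE FF) followed by two bytes per character.
It also shows why the workaround depends on the platform: new String(utf16) without a charset
uses the default charset, and only a single-byte charset such as ISO 8859-1 is guaranteed to
keep every byte as one char.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of PDFBox.
final class NameEncodingDemo
{
    /** Encodes the name the way the commented-out workaround does: BOM + UTF-16BE. */
    static byte[] utf16WithBom(String name)
    {
        return name.getBytes(StandardCharsets.UTF_16);
    }

    /** Maps every byte to one char, independent of the platform default charset. */
    static String bytesAsLatin1Chars(byte[] raw)
    {
        return new String(raw, StandardCharsets.ISO_8859_1);
    }

    /** Formats bytes as space-separated hex pairs. */
    static String hex(byte[] bytes)
    {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes)
        {
            if (sb.length() > 0)
            {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args)
    {
        String name = "EmptyPagePDF.pdf";

        byte[] latin1 = name.getBytes(StandardCharsets.ISO_8859_1);
        byte[] utf16 = utf16WithBom(name);

        System.out.println("ISO 8859-1: " + hex(latin1));
        System.out.println("UTF-16    : " + hex(utf16));

        // One char per byte; encoding these chars again as ISO 8859-1
        // gives back exactly the UTF-16 byte sequence.
        String carrier = bytesAsLatin1Chars(utf16);
        byte[] roundTrip = carrier.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println("round trip identical: " + Arrays.equals(utf16, roundTrip));
    }
}
```

With a UTF-8 default charset the bytes FE and FF are not valid input, so new String(utf16)
would replace them and the result would no longer carry the same bytes. Passing the charset
explicitly, as in bytesAsLatin1Chars above, avoids that dependency.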


The name of the attachment did not contain any special characters, but the encoding of the name
in the target directory differed from the encoding of the name in the names dictionary. The
specification for GoToE actions requires a destination, type, subtype and either a file
specification or (at least for document-level attachments) a target directory. As a possible
workaround I tried setting the attachment's file specification on the action, but that did not
work either. If I knew how GoToE works with file specifications, that would also be enough.
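
As for telling the two encodings apart: a PDF text string is either UTF-16BE with a leading
byte order mark (FE FF) or PDFDocEncoding, so the first two bytes of the raw string are enough
to decide. The following is a minimal sketch under stated assumptions, not PDFBox code: the
class and its methods are hypothetical, the raw bytes would have to be read from the file
separately, and ISO 8859-1 stands in for PDFDocEncoding, which is only equivalent for plain
ASCII names like the one used here.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of PDFBox.
final class PdfNameDecoder
{
    /** True if the raw string bytes start with the UTF-16BE byte order mark FE FF. */
    static boolean hasUtf16BeBom(byte[] raw)
    {
        return raw.length >= 2
                && (raw[0] & 0xFF) == 0xFE
                && (raw[1] & 0xFF) == 0xFF;
    }

    /** Decodes raw string bytes to a Java String based on the presence of the BOM. */
    static String decode(byte[] raw)
    {
        if (hasUtf16BeBom(raw))
        {
            // skip the two BOM bytes and decode the rest as big-endian UTF-16
            return new String(raw, 2, raw.length - 2, StandardCharsets.UTF_16BE);
        }
        // Assumption: ISO 8859-1 as a stand-in for PDFDocEncoding (ASCII range only)
        return new String(raw, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args)
    {
        String name = "EmptyPagePDF.pdf";

        // the two byte forms observed in the text editor
        byte[] inNamesDictionary = name.getBytes(StandardCharsets.UTF_16);
        byte[] inTargetDirectory = name.getBytes(StandardCharsets.ISO_8859_1);

        System.out.println("names dictionary uses UTF-16BE: " + hasUtf16BeBom(inNamesDictionary));
        System.out.println("target directory uses UTF-16BE: " + hasUtf16BeBom(inTargetDirectory));
        System.out.println("same name after decoding: "
                + decode(inNamesDictionary).equals(decode(inTargetDirectory)));
    }
}
```

Both byte forms decode to the same name, which fits the observation that the strings look
identical in Java while the viewer, apparently comparing raw bytes, does not match them.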

Dahit

-----Original Message-----
From: Tilman Hausherr [mailto:tilman@apache.org]
Sent: Thursday, 18 April 2019 06:15
To: users@pdfbox.apache.org
Subject: [bulk]: Re: Encoding of names dictionary and GoToE target

Please upload the files without JavaScript to reduce my fear of opening them. Or better, post
the code you use to create these files so that one can run that code and create the files
including the attachment. I looked at the PDF specification and from what I see with PDFDebugger
your files look fine. I assume the effect you mention happens with the attached files, even
though the names were pure ASCII?

Tilman

On 2019/04/16 12:47:37, "Gueclue, Dahit" <Dahit.Gueclue@PROSTEP.com> wrote:
> Hello,
>
> I am currently working with PDFBox 2.0.14 and I am trying to create GoToE links for document
> attachments. For this I created a test PDF file with one PDF attachment in Adobe Acrobat XI.
> The link is a PDAnnotationLink with a PDActionEmbeddedGoTo action. The filename of the
> target directory receives the name extracted from the EmbeddedFiles name dictionary. After
> adding the name, action and rectangle to the annotation, I save the resulting document
> and open it with Adobe Acrobat Reader DC.
>
> The problem I have is that the GoToE link is not opening the attachment. After looking
> at the file in a text editor, it seems as if the file name in the names dictionary uses a
> different encoding, UTF-16BE, than the file name in the target directory, ISO 8859-1. If I
> manually convert the file name to UTF-16 before adding it to the target directory, it works
> just fine. However, if I do not know the encoding of the names in the names dictionary, I
> cannot generate the correct name for the target directory.
>
> Is there a way to determine which encoding the names dictionary uses, or are there other
> workarounds to this? I tried to set the file specification of the attachment to the action
> but that did not work out for me.
>
> Attached are the input and output files.
>
>
> Regards,
>
> Dahit Güclü
>
>
> ________________________________________________________________________
> PROSTEP AG, Dolivostraße 11, D-64293 Darmstadt
> HR: Amtsgericht Darmstadt, HRB 8383
> Vorstand: Dr. Bernd Pätzold (Vorsitz), Reinhard Betz
> Dr. Karsten Theis
> Aufsichtsrat: Dr. Heinz-Gerd Lehnhoff (Vorsitz)
> ________________________________________________________________________
>


________________________________________________________________________
PROSTEP AG, Dolivostraße 11, D-64293 Darmstadt
HR: Amtsgericht Darmstadt, HRB 8383
Vorstand: Dr. Bernd Pätzold (Vorsitz), Reinhard Betz
Dr. Karsten Theis
Aufsichtsrat: Dr. Heinz-Gerd Lehnhoff (Vorsitz)
________________________________________________________________________