pdfbox-users mailing list archives

From Tilman Hausherr <THaush...@t-online.de>
Subject Re: Encoding of names dictionary and GoToE target
Date Tue, 23 Apr 2019 12:59:38 GMT
Sorry, I have no idea. I don't use this myself. I could only tell you 
what's in the PDF specification.

Tilman

On 23.04.2019 at 14:49, Gueclue, Dahit wrote:
> Ah, thank you very much. Using the COSStrings, I finally got this to work. I am curious, though: the PDF specification states that a GoToE action can work if you set the S, F and D entries of its dictionary, in which case the T entry would be optional. I thought that by setting the PDComplexFileSpecification of the embedded file with PDActionEmbeddedGoTo.setFile(), the link would work. Is the target directory always required?
>
> Dahit
>
> -----Original Message-----
> From: Tilman Hausherr [mailto:THausherr@t-online.de]
> Sent: Tuesday, 23 April 2019 11:57
> To: users@pdfbox.apache.org
> Subject: [bulk]: Re: [bulk]: Re: AW: [bulk]: Re: Encoding of names dictionary and GoToE target
>
> You'd have to write your own PDNameTreeNode.getNames() that doesn't
> convert the COSString to a String. Or better, your own utility method
> for the name tree that creates a COSString-keyed map of all levels, not
> just one. Then analyse the COSStrings that you get.
>
> Tilman
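
The utility described above could be sketched roughly as follows. This is a library-free illustration, not PDFBox code: `Node` is a hypothetical stand-in for a name tree node (its `names` map playing the role of the /Names array and `kids` the role of /Kids), and the `ByteBuffer` keys stand in for the undecoded bytes of each COSString. The only point is that keys from every level are collected without being converted to a String, so the original encoding stays visible.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class NameTreeWalk {
    /** Hypothetical stand-in for one name tree node: leaf entries plus child nodes. */
    static final class Node {
        final Map<ByteBuffer, String> names = new LinkedHashMap<>(); // raw key bytes -> value label
        final List<Node> kids = new ArrayList<>();

        Node put(byte[] rawKey, String value) {
            names.put(ByteBuffer.wrap(rawKey), value);
            return this;
        }

        Node kid(Node child) {
            kids.add(child);
            return this;
        }
    }

    /** Collects the entries of every level, keyed by the undecoded key bytes. */
    static Map<ByteBuffer, String> collect(Node root) {
        Map<ByteBuffer, String> all = new LinkedHashMap<>();
        collectInto(root, all);
        return all;
    }

    private static void collectInto(Node node, Map<ByteBuffer, String> all) {
        all.putAll(node.names);
        for (Node kid : node.kids) {
            collectInto(kid, all); // descend into intermediate and leaf nodes alike
        }
    }

    public static void main(String[] args) {
        byte[] utf16Key = "EmptyPagePDF.pdf".getBytes(StandardCharsets.UTF_16);
        Node root = new Node().kid(new Node().put(utf16Key, "15 0 R"));
        for (ByteBuffer key : collect(root).keySet()) {
            // A leading FE FF marks a UTF-16BE text string in a PDF.
            boolean utf16 = key.remaining() >= 2
                    && (key.get(0) & 0xFF) == 0xFE && (key.get(1) & 0xFF) == 0xFF;
            System.out.println(key.remaining() + " bytes, UTF-16BE BOM: " + utf16);
        }
    }
}
```

With the real library, the same recursion would read the node dictionaries directly and keep the COSString objects as keys, which is an adaptation left to the reader.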
>
> On 23.04.2019 at 10:38, Gueclue, Dahit wrote:
>> Yes, that's what I found as well. Is there a way to find this out using Java with PDFBox? If I do not know the encoding in advance, I cannot choose the right encoding for the target directory without making assumptions. In my experience the encodings had to match for the GoToE link to work.
>>
>> Dahit
>>
>> -----Original Message-----
>> From: Tilman Hausherr [mailto:tilman@apache.org]
>> Sent: Friday, 19 April 2019 07:32
>> To: users@pdfbox.apache.org
>> Subject: [bulk]: Re: AW: [bulk]: Re: Encoding of names dictionary and GoToE target
>>
>> I meant upload to a sharehoster (attachments are deleted, except when stuck in moderation, and your second one wasn't), but never mind. I removed the JavaScript programmatically and found this:
>>
>> /Names [<FEFF0045006D0070007400790050006100670065005000440046002E007000640066> 15 0 R]
>>
>> So the UTF16 is in the original file.
>>
>> Tilman
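
To make the hex string above readable, a small standalone sketch (plain Java, no PDFBox) can decode it. The class and method names are made up for illustration. It assumes that a leading FE FF marks UTF-16BE, as in the quoted /Names entry, and it uses ISO-8859-1 as a stand-in for PDFDocEncoding, which is only safe for printable ASCII.

```java
import java.nio.charset.StandardCharsets;

final class PdfHexStringDecoder {
    /** Converts a hex string such as "FEFF0045" into raw bytes. */
    static byte[] hexToBytes(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    /** True if the bytes start with the UTF-16BE byte order mark FE FF. */
    static boolean hasUtf16BeBom(byte[] raw) {
        return raw.length >= 2 && (raw[0] & 0xFF) == 0xFE && (raw[1] & 0xFF) == 0xFF;
    }

    /** Decodes the bytes, skipping the BOM when one is present. */
    static String decode(byte[] raw) {
        if (hasUtf16BeBom(raw)) {
            return new String(raw, 2, raw.length - 2, StandardCharsets.UTF_16BE);
        }
        // Assumption: ISO-8859-1 approximates PDFDocEncoding for printable ASCII only.
        return new String(raw, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] raw = hexToBytes(
                "FEFF0045006D0070007400790050006100670065005000440046002E007000640066");
        System.out.println("UTF-16BE BOM present: " + hasUtf16BeBom(raw));
        System.out.println("Decoded name: " + decode(raw));
    }
}
```

The decoded name is EmptyPagePDF.pdf, the same key the posted code looks up in the name tree.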
>>
>>
>> On 2019/04/18 09:24:26, "Gueclue, Dahit" <Dahit.Gueclue@PROSTEP.com> wrote:
>>> Here are the files without javascript. Also I used this code to produce the output:
>>>
>>> import java.io.File;
>>> import java.io.IOException;
>>> import java.util.LinkedList;
>>> import java.util.List;
>>> import java.util.Map;
>>>
>>> import org.apache.pdfbox.cos.COSName;
>>> import org.apache.pdfbox.pdmodel.PDDocument;
>>> import org.apache.pdfbox.pdmodel.PDEmbeddedFilesNameTreeNode;
>>> import org.apache.pdfbox.pdmodel.PDPage;
>>> import org.apache.pdfbox.pdmodel.common.PDRectangle;
>>> import org.apache.pdfbox.pdmodel.common.filespecification.PDComplexFileSpecification;
>>> import org.apache.pdfbox.pdmodel.interactive.action.PDActionEmbeddedGoTo;
>>> import org.apache.pdfbox.pdmodel.interactive.action.PDTargetDirectory;
>>> import org.apache.pdfbox.pdmodel.interactive.annotation.PDAnnotation;
>>> import org.apache.pdfbox.pdmodel.interactive.annotation.PDAnnotationLink;
>>> import org.apache.pdfbox.pdmodel.interactive.documentnavigation.destination.PDPageDestination;
>>> import org.apache.pdfbox.pdmodel.interactive.documentnavigation.destination.PDPageFitWidthDestination;
>>>
>>>
>>> public class AddAnnotations
>>> {
>>>       public static void main(String[] args) throws IOException
>>>       {
>>>
>>>           File file = new File("PDF with 1 PDF doc attachment.pdf");
>>>           PDDocument document = PDDocument.load(file);
>>>
>>>           PDPage page0 = document.getPage(0);
>>>           List<PDAnnotation> annotations = page0.getAnnotations();
>>>
>>>           try
>>>           {
>>>
>>>               // embedded files are stored in a name tree
>>>               PDEmbeddedFilesNameTreeNode efTree = document.getDocumentCatalog().getNames().getEmbeddedFiles();
>>>               Map<String, PDComplexFileSpecification> names = efTree.getNames();
>>>               LinkedList<String> targets = new LinkedList<String>();
>>>               targets.addAll(names.keySet());
>>>
>>>               PDComplexFileSpecification fs = names.get("EmptyPagePDF.pdf");
>>>
>>>               PDDocument ef = PDDocument.load(fs.getEmbeddedFile().createInputStream());
>>>               PDPage page = ef.getPage(0);
>>>               PDAnnotationLink annotation = new PDAnnotationLink();
>>>               PDActionEmbeddedGoTo action = new PDActionEmbeddedGoTo();
>>>               PDTargetDirectory target = new PDTargetDirectory();
>>>               PDPageDestination dest = new PDPageFitWidthDestination();
>>>
>>>               String name = targets.get(0);
>>>               //byte[] utf16 = "EmptyPagePDF.pdf".getBytes("UTF-16");
>>>               //name = new String(utf16); // works if this name is used instead
>>>               target.setFilename(name);
>>>               target.setRelationship(COSName.C);
>>>               action.setTargetDirectory(target);
>>>
>>>               //action.setFile(fs);
>>>               dest.setPage(page);
>>>               action.setDestination(dest);
>>>               dest.setPageNumber(0);
>>>               annotation.setAction(action);
>>>               annotation.setRectangle(new PDRectangle(10, 10, 100, 100));
>>>
>>>               annotations.add(annotation);
>>>
>>>
>>>               // save the PDF
>>>                   document.save("GoToE link output.pdf");
>>>                   ef.close();
>>>           }
>>>           finally
>>>           {
>>>               document.close();
>>>           }
>>>       }
>>>
>>> }
>>>
>>>
>>> The name of the attachment did not contain any special symbols, but the encoding of the name in the target directory and of the name in the names dictionary were different. The specification for GoToE actions requires a destination, type, subtype and either a filespec or (at least for document-level attachments) a target directory. As a possible workaround I tried setting the file specification of the attachment on the action, but that did not work either. If I knew how GoToE works with file specifications, that would also be enough.
>>>
>>> Dahit
>>>
>>> -----Original Message-----
>>> From: Tilman Hausherr [mailto:tilman@apache.org]
>>> Sent: Thursday, 18 April 2019 06:15
>>> To: users@pdfbox.apache.org
>>> Subject: [bulk]: Re: Encoding of names dictionary and GoToE target
>>>
>>> Please upload the files without JavaScript to reduce my fear of opening them. Or better, post the code you use to create these files so that one can run that code and create the files including the attachment. I looked at the PDF specification, and from what I see with PDFDebugger your files look fine. I assume the effect you mention happens with the attached files, even though the names were pure ASCII?
>>>
>>> Tilman
>>>
>>> On 2019/04/16 12:47:37, "Gueclue, Dahit" <Dahit.Gueclue@PROSTEP.com> wrote:
>>>> Hello,
>>>>
>>>> I am currently working with PDFBox 2.0.14 and am trying to create GoToE links for document attachments. For this I created a test PDF file with one PDF attachment in Adobe Acrobat XI.
>>>> The link is a PDAnnotationLink with a PDActionEmbeddedGoTo action. The filename of the target directory receives the name extracted from the EmbeddedFiles name dictionary. After adding the name, action and rectangle to the annotation, I save the resulting document and open it with Adobe Acrobat Reader DC.
>>>>
>>>> The problem I have is that the GoToE link does not open the attachment. After looking at the file in a text editor, it seems that the file name in the names dictionary uses a different encoding, UTF-16BE, than the file name in the target directory, ISO 8859-1. If I manually convert the file name to UTF-16 before adding it to the target directory, it works just fine. However, if I do not know the encoding of the names in the names dictionary, I cannot generate the correct name for the target directory.
>>>>
>>>> Is there a way to determine which encoding the names dictionary uses, or are there other workarounds for this? I tried to set the file specification of the attachment on the action, but that did not work out for me.
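
The mismatch described above can be reproduced with plain Java, independent of PDFBox. The following is a minimal sketch with made-up class and method names. It encodes the same attachment name once as ISO 8859-1 and once with Java's "UTF-16" charset, which writes a big-endian byte order mark (FE FF) followed by big-endian code units. A byte-wise comparison of the two results fails even though both spell the same name.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class NameEncodingMismatch {
    /** Single-byte encoding, as seen in the target directory entry. */
    static byte[] asLatin1(String name) {
        return name.getBytes(StandardCharsets.ISO_8859_1);
    }

    /** Java's UTF-16 encoder emits FE FF followed by big-endian code units. */
    static byte[] asUtf16WithBom(String name) {
        return name.getBytes(StandardCharsets.UTF_16);
    }

    /** Upper-case hex dump, in the style of a PDF hex string. */
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String name = "EmptyPagePDF.pdf";
        byte[] latin1 = asLatin1(name);
        byte[] utf16 = asUtf16WithBom(name);
        System.out.println("ISO 8859-1: <" + toHex(latin1) + ">");
        System.out.println("UTF-16:     <" + toHex(utf16) + ">");
        System.out.println("Byte-identical: " + Arrays.equals(latin1, utf16));
    }
}
```

A viewer that compares the two strings byte by byte would therefore not find the attachment, which is consistent with the behaviour reported here, although the thread does not confirm how Acrobat actually performs the comparison.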
>>>>
>>>> Attached are the input and output files.
>>>>
>>>>
>>>> Regards,
>>>>
>>>> Dahit Güclü
>>>>
>>>>
>>>> ________________________________________________________________________
>>>> PROSTEP AG, Dolivostraße 11, D-64293 Darmstadt
>>>> HR: Amtsgericht Darmstadt, HRB 8383
>>>> Vorstand: Dr. Bernd Pätzold (Vorsitz), Reinhard Betz
>>>> Dr. Karsten Theis
>>>> Aufsichtsrat: Dr. Heinz-Gerd Lehnhoff (Vorsitz)
>>>> ________________________________________________________________________
>>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
>>> For additional commands, e-mail: users-help@pdfbox.apache.org

