Subject: Re: Encoding of names dictionary and GoToE target
To: users@pdfbox.apache.org
From: Tilman Hausherr
Date: Tue, 23 Apr 2019 14:59:38 +0200

Sorry, I have no idea. I don't use this myself. I could only tell you
what's in the PDF specification.

Tilman

On 23.04.2019 at 14:49, Gueclue, Dahit wrote:
> Ah, thank you very much. Using the COSStrings, I finally got this to work. I am curious, though: the PDF specification states that a GoToE action can work if you set the S, F and D entries of its dictionary; the T entry would then be optional. I thought that by setting the PDComplexFileSpecification of the embedded file with PDActionEmbeddedGoTo.setFile(), the link would work. Is the target directory always required?
>
> Dahit
>
> -----Original Message-----
> From: Tilman Hausherr [mailto:THausherr@t-online.de]
> Sent: Tuesday, 23 April 2019 11:57
> To: users@pdfbox.apache.org
> Subject: [bulk]: Re: [bulk]: Re: AW: [bulk]: Re: Encoding of names dictionary and GoToE target
>
> You'd have to write your own PDNameTreeNode.getNames() that doesn't
> convert the COSString to a String. Or better, your own utility method
> for the name tree that creates a COSString-keyed map of all entries,
> not just one level. Then analyse the COSStrings that you get.
>
> Tilman
>
> On 23.04.2019 at 10:38, Gueclue, Dahit wrote:
>> Yes, that's what I found as well. Is there a way to find this out using Java with PDFBox? If I do not know the encoding in advance, I cannot choose the right encoding for the target directory without making assumptions. In my experience, the encodings had to match for the GoToE link to work.
>>
>> Dahit
>>
>> -----Original Message-----
>> From: Tilman Hausherr [mailto:tilman@apache.org]
>> Sent: Friday, 19 April 2019 07:32
>> To: users@pdfbox.apache.org
>> Subject: [bulk]: Re: AW: [bulk]: Re: Encoding of names dictionary and GoToE target
>>
>> I meant upload to a sharehoster (attachments are deleted, except when stuck in moderation, and your second one wasn't), but never mind. I removed the JavaScript programmatically and found this:
>>
>> /Names [ 15 0 R]
>>
>> So the UTF-16 is in the original file.
>>
>> Tilman
>>
>>
>> On 2019/04/18 09:24:26, "Gueclue, Dahit" wrote:
>>> Here are the files without JavaScript.
>>> Also, I used this code to produce the output:
>>>
>>> import java.io.File;
>>> import java.io.IOException;
>>> import java.util.LinkedList;
>>> import java.util.List;
>>> import java.util.Map;
>>>
>>> import org.apache.pdfbox.cos.COSName;
>>> import org.apache.pdfbox.pdmodel.PDDocument;
>>> import org.apache.pdfbox.pdmodel.PDEmbeddedFilesNameTreeNode;
>>> import org.apache.pdfbox.pdmodel.PDPage;
>>> import org.apache.pdfbox.pdmodel.common.PDRectangle;
>>> import org.apache.pdfbox.pdmodel.common.filespecification.PDComplexFileSpecification;
>>> import org.apache.pdfbox.pdmodel.interactive.action.PDActionEmbeddedGoTo;
>>> import org.apache.pdfbox.pdmodel.interactive.action.PDTargetDirectory;
>>> import org.apache.pdfbox.pdmodel.interactive.annotation.PDAnnotation;
>>> import org.apache.pdfbox.pdmodel.interactive.annotation.PDAnnotationLink;
>>> import org.apache.pdfbox.pdmodel.interactive.documentnavigation.destination.PDPageDestination;
>>> import org.apache.pdfbox.pdmodel.interactive.documentnavigation.destination.PDPageFitWidthDestination;
>>>
>>>
>>> public class AddAnnotations
>>> {
>>>     public static void main(String[] args) throws IOException
>>>     {
>>>         File file = new File("PDF with 1 PDF doc attachment.pdf");
>>>         PDDocument document = PDDocument.load(file);
>>>
>>>         PDPage page0 = document.getPage(0);
>>>         List<PDAnnotation> annotations = page0.getAnnotations();
>>>
>>>         try
>>>         {
>>>             // embedded files are stored in a named tree
>>>             PDEmbeddedFilesNameTreeNode efTree = document.getDocumentCatalog().getNames().getEmbeddedFiles();
>>>             Map<String, PDComplexFileSpecification> names = efTree.getNames();
>>>             LinkedList<String> targets = new LinkedList<String>();
>>>             targets.addAll(names.keySet());
>>>
>>>             PDComplexFileSpecification fs = names.get("EmptyPagePDF.pdf");
>>>
>>>             // open the attachment to get the page that the link should show
>>>             PDDocument ef = PDDocument.load(fs.getEmbeddedFile().createInputStream());
>>>             PDPage page = ef.getPage(0);
>>>             PDAnnotationLink annotation = new PDAnnotationLink();
>>>             PDActionEmbeddedGoTo action = new PDActionEmbeddedGoTo();
>>>             PDTargetDirectory target = new PDTargetDirectory();
>>>             PDPageDestination dest = new PDPageFitWidthDestination();
>>>
>>>             // name of the attachment as returned by the name tree
>>>             String name = new String(targets.get(0));
>>>             //byte[] utf16 = new String("EmptyPagePDF.pdf").getBytes("UTF-16");
>>>             //name = new String(utf16); // works if this name is used instead
>>>             target.setFilename(name);
>>>             target.setRelationship(COSName.C);
>>>             action.setTargetDirectory(target);
>>>
>>>             //action.setFile(fs);
>>>             dest.setPage(page);
>>>             action.setDestination(dest);
>>>             dest.setPageNumber(0);
>>>             annotation.setAction(action);
>>>             annotation.setRectangle(new PDRectangle(10, 10, 100, 100));
>>>
>>>             annotations.add(annotation);
>>>
>>>             // save the PDF
>>>             document.save("GoToE link output.pdf");
>>>             ef.close();
>>>         }
>>>         finally
>>>         {
>>>             document.close();
>>>         }
>>>     }
>>> }
>>>
>>>
>>> The name of the attachment did not contain any special symbols, but the encoding of the name in the target directory differed from the encoding of the name in the names dictionary. The specification for GoToE actions requires a destination, type, subtype and either a filespec or (at least for document-level attachments) a target directory. As a possible workaround, I tried setting the file specification of the attachment on the action, but that did not work either. If I knew how GoToE works with file specifications, that would also be enough.
>>>
>>> Dahit
>>>
>>> -----Original Message-----
>>> From: Tilman Hausherr [mailto:tilman@apache.org]
>>> Sent: Thursday, 18 April 2019 06:15
>>> To: users@pdfbox.apache.org
>>> Subject: [bulk]: Re: Encoding of names dictionary and GoToE target
>>>
>>> Please upload the files without JavaScript to reduce my fear of opening them. Or better, post the code you use to create these files so that one can run that code and create the files, including the attachment. I looked at the PDF specification, and from what I see with PDFDebugger your files look fine.
>>> I assume the effect you mention happens with the attached files, even though the names were pure ASCII?
>>>
>>> Tilman
>>>
>>> On 2019/04/16 12:47:37, "Gueclue, Dahit" wrote:
>>>> Hello,
>>>>
>>>> I am currently working with PDFBox 2.0.14 and I am trying to create GoToE links for document attachments. For this I created a test PDF file with one PDF attachment in Adobe Acrobat XI. The link is a PDAnnotationLink with a PDActionEmbeddedGoTo action. The filename of the target directory receives the name extracted from the EmbeddedFiles name dictionary. After adding the name, action and rectangle to the annotation, I save the resulting document and open it with Adobe Acrobat Reader DC.
>>>>
>>>> The problem I have is that the GoToE link does not open the attachment. Looking at the file in a text editor, it seems that the file name in the names dictionary uses a different encoding (UTF-16BE) than the file name in the target directory (ISO 8859-1). If I manually convert the file name to UTF-16 before adding it to the target directory, it works just fine. However, if I do not know the encoding of the names in the names dictionary, I cannot generate the correct name for the target directory.
>>>>
>>>> Is there a way to determine which encoding the names dictionary uses, or are there other workarounds for this? I tried to set the file specification of the attachment on the action, but that did not work for me.
>>>>
>>>> Attached are the input and output files.
>>>>
>>>>
>>>> Regards,
>>>>
>>>> Dahit Güclü
>>>>
>>>>
>>>> ________________________________________________________________________
>>>> PROSTEP AG, Dolivostraße 11, D-64293 Darmstadt
>>>> Commercial register: Amtsgericht Darmstadt, HRB 8383
>>>> Executive Board: Dr. Bernd Pätzold (Chair), Reinhard Betz
>>>> Dr. Karsten Theis
>>>> Supervisory Board: Dr. Heinz-Gerd Lehnhoff (Chair)
>>>> ________________________________________________________________________
>>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
>>> For additional commands, e-mail: users-help@pdfbox.apache.org
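
On the question in this thread of how to tell which encoding a name-tree key uses: a PDF text string that starts with the bytes FE FF is UTF-16BE, otherwise it is PDFDocEncoding. Below is a minimal sketch in plain Java (no PDFBox needed) of how the raw bytes could be analysed. The class and method names are hypothetical, the byte arrays are stand-ins for what one would presumably get from COSString.getBytes(), and ISO 8859-1 is used only as an approximation of PDFDocEncoding, which is good enough for a pure ASCII name like the one discussed here.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of PDFBox: inspects the raw bytes of a PDF text string.
class PdfNameEncodingSketch
{
    // A PDF text string that starts with the byte order mark FE FF is UTF-16BE.
    static boolean isUtf16Be(byte[] raw)
    {
        return raw.length >= 2 && (raw[0] & 0xFF) == 0xFE && (raw[1] & 0xFF) == 0xFF;
    }

    // Decodes the bytes for display. Without a BOM the bytes are read as
    // ISO 8859-1, which is only an approximation of PDFDocEncoding.
    static String decode(byte[] raw)
    {
        if (isUtf16Be(raw))
        {
            // skip the two BOM bytes
            return new String(raw, 2, raw.length - 2, StandardCharsets.UTF_16BE);
        }
        return new String(raw, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args)
    {
        // Stand-ins for the bytes of a name-tree key and of a target directory name.
        // Java's UTF_16 charset encodes big-endian with a leading FE FF.
        byte[] treeKey = "EmptyPagePDF.pdf".getBytes(StandardCharsets.UTF_16);
        byte[] targetName = "EmptyPagePDF.pdf".getBytes(StandardCharsets.ISO_8859_1);

        System.out.println("tree key is UTF-16BE:    " + isUtf16Be(treeKey));
        System.out.println("target name is UTF-16BE: " + isUtf16Be(targetName));
        System.out.println("same text:  " + decode(treeKey).equals(decode(targetName)));
        System.out.println("same bytes: " + Arrays.equals(treeKey, targetName));
    }
}
```

In this sketch the two names decode to the same text while their bytes differ, which would be consistent with the observation in the thread that the link only worked once the target name used the same encoding as the key in the names dictionary.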