From: Gilad Denneboom
Date: Sat, 9 Nov 2013 14:18:33 +0100
Subject: Re: bookmark.getDestination is null
To: "users@pdfbox.apache.org"

I wrote this code for you to do this task in a different way:

PDDocument doc = PDDocument.load("c:/temp/2013_DA_Schmitz.pdf");
PDPage p = doc.getDocumentCatalog().getDocumentOutline().getFirstChild().findDestinationPage(doc);
List pages = doc.getDocumentCatalog().getAllPages();
for (int i=0; i [...]

[...] wrote:

> My main goal is to extract the chapter names, the page count of each
> chapter, and a way to see whether something new was written in a chapter.
> Further, I want to extract the bullet points inside the PDF, but that's
> not so relevant. I've already got the chapter names out of PDFBox, so
> that works. For the "see if something's new" part I wanted to count the
> characters.
>
> On 09.11.2013 at 10:53, Gilad Denneboom <gilad.denneboom@gmail.com> wrote:
>
>> Is writing the code a part of your thesis, or extracting the "chapters"?
>> If the latter, have you considered doing it with JavaScript in Acrobat
>> instead of using Java?
>>
>> On Sat, Nov 9, 2013 at 10:00 AM, Sera wrote:
>>
>>> https://www2.swc.rwth-aachen.de/docs/2013_DA_Schmitz.pdf
>>>
>>> This would be a sample. It was made with LaTeX and consists of more
>>> than one .tex file.
>>>
>>> Hope it can help.
>>> It's for my bachelor thesis and otherwise I'm lost :(
>>>
>>> BR
>>> Sera
>>>
>>> On 09.11.2013 at 09:32, Maruan Sahyoun wrote:
>>>
>>>> Hi Sera,
>>>>
>>>> if the bookmarks do not relate to pages, they cannot be taken as a
>>>> hint for splitting.
>>>>
>>>> Is it possible to upload a sample PDF to a public location so we can
>>>> take a look at it? That might give us another idea for handling your
>>>> requirement.
>>>>
>>>> BR
>>>> Maruan
>>>>
>>>> On 08.11.2013 at 17:24, Sera wrote:
>>>>
>>>>> Well then. I've got another idea.
>>>>> Actually, I don't need the exact page number, but the page count of
>>>>> each chapter.
>>>>> Is it still possible to divide the PDF by its bookmarks, or wouldn't
>>>>> that work either?
>>>>> Once I've divided it, I can just call doc.getNumberOfPages(). That
>>>>> works here.
>>>>>
>>>>> On 08.11.2013 at 17:19, Gilad Denneboom <gilad.denneboom@gmail.com> wrote:
>>>>>
>>>>>> Yes, that could very well be the cause...
>>>>>>
>>>>>> On Fri, Nov 8, 2013 at 4:51 PM, Sera wrote:
>>>>>>
>>>>>>> Could it be a problem with LaTeX?
>>>>>>> I'm using it to generate the PDF.
>>>>>>>
>>>>>>> On 08.11.2013 at 16:40, Sera wrote:
>>>>>>>
>>>>>>>> First, thanks for the code!
>>>>>>>>
>>>>>>>> Unfortunately, I still get a NullPointerException.
>>>>>>>> dests.getNames() is null.
>>>>>>>>
>>>>>>>> On 04.11.2013 at 13:38, Gilad Denneboom <gilad.denneboom@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> I wrote the following code to do it:
>>>>>>>>>
>>>>>>>>> public static int getPageNumberFromNamedDestination(PDDocument doc,
>>>>>>>>>         String name) throws IOException {
>>>>>>>>>     PDDestinationNameTreeNode dests =
>>>>>>>>>         doc.getDocumentCatalog().getNames().getDests();
>>>>>>>>>     if (dests==null || dests.getNames()==null)
>>>>>>>>>         return -1;
>>>>>>>>>     Object d = dests.getNames().get(name);
>>>>>>>>>     if (d==null)
>>>>>>>>>         return -1;
>>>>>>>>>     return getPageDestPageNumber(d);
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> public static int getPageDestPageNumber(Object dest) {
>>>>>>>>>
>>>>>>>>>     if (dest instanceof PDPageFitDestination) {
>>>>>>>>>         PDPageFitDestination pageFitDestination =
>>>>>>>>>             (PDPageFitDestination) dest;
>>>>>>>>>         return pageFitDestination.findPageNumber();
>>>>>>>>>     }
>>>>>>>>>
>>>>>>>>>     if (dest instanceof PDPageXYZDestination) {
>>>>>>>>>         PDPageXYZDestination pageXYZDestination =
>>>>>>>>>             (PDPageXYZDestination) dest;
>>>>>>>>>         return pageXYZDestination.findPageNumber();
>>>>>>>>>     }
>>>>>>>>>
>>>>>>>>>     if (dest instanceof PDPageFitWidthDestination) {
>>>>>>>>>         PDPageFitWidthDestination fitWidthDestination =
>>>>>>>>>             (PDPageFitWidthDestination) dest;
>>>>>>>>>         return fitWidthDestination.findPageNumber();
>>>>>>>>>     }
>>>>>>>>>
>>>>>>>>>     if (dest instanceof PDPageFitHeightDestination) {
>>>>>>>>>         PDPageFitHeightDestination fitHeightDestination =
>>>>>>>>>             (PDPageFitHeightDestination) dest;
>>>>>>>>>         return fitHeightDestination.findPageNumber();
>>>>>>>>>     }
>>>>>>>>>
>>>>>>>>>     if (dest instanceof PDPageFitRectangleDestination) {
>>>>>>>>>         PDPageFitRectangleDestination pageFitRectangleDestination =
>>>>>>>>>             (PDPageFitRectangleDestination) dest;
>>>>>>>>>         return pageFitRectangleDestination.findPageNumber();
>>>>>>>>>     }
>>>>>>>>>
>>>>>>>>>     return -1;
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> On Sun, Nov 3, 2013 at 1:39 PM, Sera wrote:
>>>>>>>>>
>>>>>>>>>> I've debugged it and it throws an exception.
>>>>>>>>>>
>>>>>>>>>> PDDestinationNameTreeNode node = (PDDestinationNameTreeNode)
>>>>>>>>>>     document.getDocumentCatalog().getStructureTreeRoot().getIDTree();
>>>>>>>>>>
>>>>>>>>>> Any idea what the correct way is?
>>>>>>>>>>
>>>>>>>>>> On 01.11.2013 at 23:47, Sera wrote:
>>>>>>>>>>
>>>>>>>>>>> Is this the right way to get to the tree node?
>>>>>>>>>>>
>>>>>>>>>>> On 31.10.2013 at 11:28, Gilad Denneboom <gilad.denneboom@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> If the destination is a PDNamedDestination object, you have
>>>>>>>>>>>> to cast it to that class...
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Oct 31, 2013 at 11:24 AM, Sera wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Do I have to cast Action to another type than ActionGoTo? I
>>>>>>>>>>>>> don't see a function getNamedDestination() in the
>>>>>>>>>>>>> suggestions for my objects.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 31.10.2013 at 10:45, Gilad Denneboom <gilad.denneboom@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Ah, so your bookmarks are not pointing to page locations
>>>>>>>>>>>>>> directly, but to Named Destinations. This makes things more
>>>>>>>>>>>>>> complex. You can use getNamedDestination() to get the name
>>>>>>>>>>>>>> of the ND the bookmark is pointing to. Of course, then you
>>>>>>>>>>>>>> still need to write a function that looks up that specific
>>>>>>>>>>>>>> ND in the tree (a PDDestinationNameTreeNode object) and
>>>>>>>>>>>>>> then figures out which page it points to from its value.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Thu, Oct 31, 2013 at 10:35 AM, Sera wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> When I call toString() on it, I get:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> org.apache.pdfbox.pdmodel.interactive.documentnavigation.destination.PDNamedDestination@505484dc
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> where the part after the @ is always different. I think
>>>>>>>>>>>>>>> it's the hashed destination?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 31.10.2013 at 10:20, Gilad Denneboom <gilad.denneboom@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> What do you mean by "hashcode", exactly?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Thu, Oct 31, 2013 at 10:16 AM, Sera wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> OK, now I've got the destination as a hashcode. How do I
>>>>>>>>>>>>>>>>> get the page number from this?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 30.10.2013 at 20:10, Gilad Denneboom <gilad.denneboom@gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Like I said, you need to determine (using instanceof,
>>>>>>>>>>>>>>>>>> for example) which actual class it is, one of the
>>>>>>>>>>>>>>>>>> subclasses of PDAction, like PDActionGoTo ...
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Wed, Oct 30, 2013 at 7:51 PM, Sera wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> current.getAction() is just a PDAction. From there I
>>>>>>>>>>>>>>>>>>> don't have access to getDestination().
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On 30.10.2013 at 16:27, Gilad Denneboom <gilad.denneboom@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> You should get the Action of the bookmark, then check
>>>>>>>>>>>>>>>>>>>> which type of action it is (probably PDActionGoTo),
>>>>>>>>>>>>>>>>>>>> and from the Action you'll have access to the
>>>>>>>>>>>>>>>>>>>> Destination.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Wed, Oct 30, 2013 at 4:00 PM, Sera wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Hello!
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I need to extract the page number from the bookmarks
>>>>>>>>>>>>>>>>>>>>> and tried it with
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> PDOutlineItem current = bookmark.getFirstChild();
>>>>>>>>>>>>>>>>>>>>> PDDestination destination = null;
>>>>>>>>>>>>>>>>>>>>> destination = current.getDestination();
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> But the destination stays null. Any ideas on how to
>>>>>>>>>>>>>>>>>>>>> fix this?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>>>> Sera
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>>> Created with Opera's e-mail module: http://www.opera.com/mail/