pdfbox-users mailing list archives

From Gilad Denneboom <gilad.denneb...@gmail.com>
Subject Re: bookmark.getDestination is null
Date Sat, 09 Nov 2013 13:18:33 GMT
I wrote this code for you to do this task in a different way:

        PDDocument doc = PDDocument.load("c:/temp/2013_DA_Schmitz.pdf");
        PDPage p = doc.getDocumentCatalog().getDocumentOutline()
                .getFirstChild().findDestinationPage(doc);
        List<PDPage> pages = doc.getDocumentCatalog().getAllPages();
        for (int i=0; i<pages.size(); i++)
            if (pages.get(i).equals(p))
                System.out.println(i);
        doc.close();

This looks up the page number of the first bookmark in the file, and it
returns 14 (remember it's 0-based).
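
For the page count per chapter, one possible follow-up is sketched below. It
assumes you have already collected the 0-based start page of every top-level
bookmark (for example by repeating the lookup above for each sibling) and
that the list is in ascending order. The class name ChapterPages and the
method pageCounts are made up for this illustration and are not part of
PDFBox.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not a PDFBox class: turns chapter start pages
// into per-chapter page counts.
final class ChapterPages {

    // startPages: 0-based start page of each chapter, ascending (assumed).
    // totalPages: number of pages in the whole document.
    static List<Integer> pageCounts(List<Integer> startPages, int totalPages) {
        List<Integer> counts = new ArrayList<Integer>();
        for (int i = 0; i < startPages.size(); i++) {
            int start = startPages.get(i);
            // A chapter ends where the next one starts; the last chapter
            // runs to the end of the document.
            int end = (i + 1 < startPages.size())
                    ? startPages.get(i + 1)
                    : totalPages;
            counts.add(end - start);
        }
        return counts;
    }
}
```

With start pages 14, 20 and 35 in a 50-page file this gives 6, 15 and 15.
Note that a chapter starting in the middle of a page shares that page with
the previous chapter, so the counts are approximate in that case.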

Gilad


On Sat, Nov 9, 2013 at 12:51 PM, Sera <news4sera@gmx.de> wrote:

> My main goal is to extract the chapter names, the page count of each chapter,
> and a way to see if something new was written in the chapter.
> Further, I want to extract the bullet points inside the PDF, but that's not
> so relevant. I've got the chapter names out of PDFBox, so that works.
> For the "see if something's new" part, I wanted to count the characters.
>
>
> On 09.11.2013 at 10:53, Gilad Denneboom <gilad.denneboom@gmail.com> wrote:
>
>
>> Is writing the code a part of your thesis, or extracting the "chapters"?
>> If the latter, have you considered doing it with JavaScript in Acrobat
>> instead of using Java?
>>
>>
>> On Sat, Nov 9, 2013 at 10:00 AM, Sera <news4sera@gmx.de> wrote:
>>
>>  https://www2.swc.rwth-aachen.de/docs/2013_DA_Schmitz.pdf
>>>
>>> This would be a sample. It was made with LaTeX and consists of more than
>>> one .tex file.
>>>
>>> Hope it can help. It's for my bachelor thesis and otherwise I'm lost :(
>>>
>>> BR
>>> Sera
>>>
>>>
>>> On 09.11.2013 at 09:32, Maruan Sahyoun <sahyoun@fileaffairs.de> wrote:
>>>
>>>
>>>  Hi Sera,
>>>
>>>>
>>>> if the bookmarks do not relate to pages, they cannot be taken as a hint
>>>> for splitting.
>>>>
>>>> Is it possible to upload a sample PDF to a public location so we can take
>>>> a look at it? It might give us another idea for handling your requirement.
>>>>
>>>> BR
>>>> Maruan
>>>>
>>>> On 08.11.2013 at 17:24, Sera <news4sera@gmx.de> wrote:
>>>>
>>>>  Well then. I've got another idea.
>>>>
>>>>> Actually, I don't need the exact page number, but the page count of each
>>>>> chapter.
>>>>> Is it still possible to divide the PDF by its bookmarks, or wouldn't
>>>>> that work either?
>>>>> Once I've divided them, I can just call doc.getNumberOfPages(). That
>>>>> works here.
>>>>>
>>>>> On 08.11.2013 at 17:19, Gilad Denneboom <gilad.denneboom@gmail.com> wrote:
>>>>>
>>>>>  Yes, that could very well be the cause...
>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Nov 8, 2013 at 4:51 PM, Sera <news4sera@gmx.de> wrote:
>>>>>>
>>>>>>  Could it be a problem with LaTeX?
>>>>>>
>>>>>>> I'm using it to generate the PDF.
>>>>>>>
>>>>>>> On 08.11.2013 at 16:40, Sera <news4sera@gmx.de> wrote:
>>>>>>>
>>>>>>>
>>>>>>> First, thanks for the code!
>>>>>>>
>>>>>>>  Unfortunately, I still get a NullPointerException.
>>>>>>>> dests.getNames() is null.
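
One possible explanation, offered as an assumption about this file rather
than a confirmed diagnosis: in a PDF name tree the root node holds either a
Names array or a Kids array, so a lookup that only reads the root's names
finds nothing when the entries live in child nodes. The sketch below models
that with a made-up NameTreeSketch class (not a PDFBox type) and walks the
kids recursively; the page numbers stored in it are illustrative only.

```java
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a name tree node, for illustration only.
final class NameTreeSketch {
    final Map<String, Integer> names;   // null when the node only has kids
    final List<NameTreeSketch> kids;    // null when the node is a leaf

    NameTreeSketch(Map<String, Integer> names, List<NameTreeSketch> kids) {
        this.names = names;
        this.kids = kids;
    }

    // Returns the page index stored under the name, or -1 if it is absent.
    static int lookup(NameTreeSketch node, String name) {
        if (node == null) {
            return -1;
        }
        // Check the entries held directly by this node, if any.
        if (node.names != null && node.names.containsKey(name)) {
            return node.names.get(name);
        }
        // Otherwise descend into the child nodes.
        if (node.kids != null) {
            for (NameTreeSketch kid : node.kids) {
                int page = lookup(kid, name);
                if (page >= 0) {
                    return page;
                }
            }
        }
        return -1;
    }
}
```

A real lookup would also use each kid's Limits entry to skip subtrees that
cannot contain the name; the sketch leaves that out for brevity.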
>>>>>>>>
>>>>>>>> On 04.11.2013 at 13:38, Gilad Denneboom <gilad.denneboom@gmail.com> wrote:
>>>>>>>>
>>>>>>>> I wrote the following code to do it:
>>>>>>>>
>>>>>>>>
>>>>>>>>>    public static int getPageNumberFromNamedDestination(PDDocument doc,
>>>>>>>>>            String name) throws IOException {
>>>>>>>>>        PDDestinationNameTreeNode dests =
>>>>>>>>>                doc.getDocumentCatalog().getNames().getDests();
>>>>>>>>>        if (dests==null || dests.getNames()==null)
>>>>>>>>>            return -1;
>>>>>>>>>        Object d = dests.getNames().get(name);
>>>>>>>>>        if (d==null)
>>>>>>>>>            return -1;
>>>>>>>>>        return getPageDestPageNumber(d);
>>>>>>>>>    }
>>>>>>>>>
>>>>>>>>>    public static int getPageDestPageNumber(Object dest) {
>>>>>>>>>
>>>>>>>>>        if (dest instanceof PDPageFitDestination) {
>>>>>>>>>            PDPageFitDestination pageFitDestination =
>>>>>>>>>                    (PDPageFitDestination) dest;
>>>>>>>>>            return pageFitDestination.findPageNumber();
>>>>>>>>>        }
>>>>>>>>>
>>>>>>>>>        if (dest instanceof PDPageXYZDestination) {
>>>>>>>>>            PDPageXYZDestination pageXYZDestination =
>>>>>>>>>                    (PDPageXYZDestination) dest;
>>>>>>>>>            return pageXYZDestination.findPageNumber();
>>>>>>>>>        }
>>>>>>>>>
>>>>>>>>>        if (dest instanceof PDPageFitWidthDestination) {
>>>>>>>>>            PDPageFitWidthDestination fitWidthDestination =
>>>>>>>>>                    (PDPageFitWidthDestination) dest;
>>>>>>>>>            return fitWidthDestination.findPageNumber();
>>>>>>>>>        }
>>>>>>>>>
>>>>>>>>>        if (dest instanceof PDPageFitHeightDestination) {
>>>>>>>>>            PDPageFitHeightDestination fitHeightDestination =
>>>>>>>>>                    (PDPageFitHeightDestination) dest;
>>>>>>>>>            return fitHeightDestination.findPageNumber();
>>>>>>>>>        }
>>>>>>>>>
>>>>>>>>>        if (dest instanceof PDPageFitRectangleDestination) {
>>>>>>>>>            PDPageFitRectangleDestination pageFitRectangleDestination =
>>>>>>>>>                    (PDPageFitRectangleDestination) dest;
>>>>>>>>>            return pageFitRectangleDestination.findPageNumber();
>>>>>>>>>        }
>>>>>>>>>
>>>>>>>>>        return -1;
>>>>>>>>>    }
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Sun, Nov 3, 2013 at 1:39 PM, Sera <news4sera@gmx.de> wrote:
>>>>>>>>>
>>>>>>>>> I've debugged it and it throws an exception.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> PDDestinationNameTreeNode node = (PDDestinationNameTreeNode)
>>>>>>>>>> document.getDocumentCatalog().getStructureTreeRoot().getIDTree();
>>>>>>>>>>
>>>>>>>>>> any idea what the correct way is?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 01.11.2013 at 23:47, Sera <news4sera@gmx.de> wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Is this the right way to get to the tree node?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> On 31.10.2013 at 11:28, Gilad Denneboom <gilad.denneboom@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>> If the destination is a PDNamedDestination object, you have to cast
>>>>>>>>>>> it to that class...
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Oct 31, 2013 at 11:24 AM, Sera <news4sera@gmx.de> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Do I have to cast Action to another type than ActionGoTo? I don't
>>>>>>>>>>>> see a function getNamedDestination() in the suggestions for my
>>>>>>>>>>>> objects.
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 31.10.2013 at 10:45, Gilad Denneboom <gilad.denneboom@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Ah, so your bookmarks are not pointing to page locations directly,
>>>>>>>>>>>>> but to Named Destinations. This makes things more complex. You can
>>>>>>>>>>>>> use getNamedDestination() to get the name of the ND the bookmark is
>>>>>>>>>>>>> pointing to. Of course, then you still need to write a function that
>>>>>>>>>>>>> looks up that specific ND in the tree (a PDDestinationNameTreeNode
>>>>>>>>>>>>> object) and then figures out which page it's pointing to by its
>>>>>>>>>>>>> value.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Thu, Oct 31, 2013 at 10:35 AM, Sera <news4sera@gmx.de> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> When I call toString() on it, I get:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> org.apache.pdfbox.pdmodel.interactive.documentnavigation.destination.PDNamedDestination@505484dc
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> where the part after the @ is always different. I think it's the
>>>>>>>>>>>>>> hashed destination?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 31.10.2013 at 10:20, Gilad Denneboom <gilad.denneboom@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> What do you mean by "hashcode", exactly?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Thu, Oct 31, 2013 at 10:16 AM, Sera <news4sera@gmx.de> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> OK, now I've got the destination as a hashcode. How do I get the
>>>>>>>>>>>>>>>> page number from this?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 30.10.2013 at 20:10, Gilad Denneboom <gilad.denneboom@gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Like I said, you need to determine (using instanceof, for example)
>>>>>>>>>>>>>>>>> which actual class it is, one of the subclasses of PDAction, like
>>>>>>>>>>>>>>>>> PDActionGoTo...
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Wed, Oct 30, 2013 at 7:51 PM, Sera <news4sera@gmx.de> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> current.getAction() is just a PDAction. From there I don't have
>>>>>>>>>>>>>>>>>> access to getDestination().
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On 30.10.2013 at 16:27, Gilad Denneboom <gilad.denneboom@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> You should get the Action of the bookmark, and then check which
>>>>>>>>>>>>>>>>>>> type of action it is (probably PDActionGoTo), and from the Action
>>>>>>>>>>>>>>>>>>> you'll have access to the Destination.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Wed, Oct 30, 2013 at 4:00 PM, Sera <news4sera@gmx.de> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Hello!
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I need to extract the page number out of the bookmarks and tried
>>>>>>>>>>>>>>>>>>>> it with
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>     PDOutlineItem current = bookmark.getFirstChild();
>>>>>>>>>>>>>>>>>>>>     PDDestination destination = null;
>>>>>>>>>>>>>>>>>>>>     destination = current.getDestination();
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> But the destination stays null. Any ideas on how to fix this?
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>>> Sera
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Created with Opera's e-mail module: http://www.opera.com/mail/
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>