pdfbox-users mailing list archives

From Tilman Hausherr <THaush...@t-online.de>
Subject Re: More questions about page iteration
Date Tue, 16 May 2017 12:42:02 GMT
On 16.05.2017 at 14:35, David Patterson wrote:
> Tilman,
>
> The code I tried is:
>
> byte[] bytes = // content of file as a byte array
> PDDocument pdDocument = PDDocument.load( bytes );
> PDDocumentCatalog cat2 = pdDocument.getDocumentCatalog();
> PDPageLabels pageLabels = cat2.getPageLabels();
> if ( pageLabels == null ) {
> System.out.println( "Page labels missing " );
> }
>
>
> I'm getting "Page labels missing" on each document.

Then let's go back to the beginning. You mentioned "I've got page numbers
like "TOC-1", "TOC-2", "Page 1"". Where did these show up?

Tilman
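
If the catalog has no /PageLabels entry, one possibility (an assumption, not
something confirmed in this thread) is that "TOC-1" and "Page 1" are ordinary
text printed in each page's header or footer. A minimal sketch of checking for
that is below. It assumes the text of a single page has already been extracted,
for example with PDFTextStripper restricted to one page via setStartPage() and
setEndPage(). The class name, method name, and the regular expression are
hypothetical and cover only the label shapes quoted above.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: finds a printed page number in already-extracted page text. */
final class PrintedLabelFinder {
    // Assumed label shapes, taken from the examples in this thread: "TOC-1", "Page 1".
    // A label must stand alone on its line (surrounding spaces and tabs are allowed).
    private static final Pattern LABEL =
            Pattern.compile("^[ \\t]*(TOC-\\d+|Page \\d+)[ \\t]*$", Pattern.MULTILINE);

    /** Returns the last line of pageText that looks like a printed label, if any. */
    static Optional<String> findPrintedLabel(String pageText) {
        Matcher m = LABEL.matcher(pageText);
        String last = null;
        while (m.find()) {
            last = m.group(1); // keep the last match; footers tend to come late in the text
        }
        return Optional.ofNullable(last);
    }

    public static void main(String[] args) {
        String pageText = "Contents\nChapter 1 ..... 1\nTOC-1\n";
        System.out.println(findPrintedLabel(pageText).orElse("(none)")); // prints TOC-1
    }
}
```

Taking the last match is a guess that the number sits in the footer; for
documents that print it in the header, the first match would be the better
choice.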


>
> I have no idea of, or control over, the process used to convert a Word file
> into a PDF. I just inherited a bunch of PDFs that I'm trying to interpret.
>
> Dave Patterson
>
> On Mon, May 15, 2017 at 1:57 PM, Tilman Hausherr <THausherr@t-online.de>
> wrote:
>
>> On 15.05.2017 at 19:11, David Patterson wrote:
>>
>>> Alas, after testing with my documents, the PageLabels is null. :-(
>>>
>> But you said it has "TOC-1". That sounds like page labels. You can also try
>> PDFDebugger; it will show the labels if there are any.
>>
>> Tilman
>>
>>
>>
>>> Thank you for the help and encouragement.
>>>
>>> Dave Patterson
>>>
>>> On Mon, May 15, 2017 at 12:34 PM, Tilman Hausherr <THausherr@t-online.de>
>>> wrote:
>>>
>>>> On 15.05.2017 at 18:30, David Patterson wrote:
>>>>> Tilman,
>>>>> Thank you very much. (I feel bad asking some of the questions, but the
>>>>> data is stored in "out of the way" corners that are hard to find.)
>>>>>
>>>> Don't :-)
>>>>
>>>>> Is there any documentation that explains how the linkages work? Would it
>>>>> help to have the PDF Standard Document?
>>>>>
>>>>>
>>>> Yes. I consult it all the time. The PDFBox API closely follows the PDF
>>>> specification. Here, the page labels are linked from the document
>>>> catalog, so the methods to use are in the PDDocumentCatalog class. But
>>>> asking was a good decision, as this got you that convenience method
>>>> (which is in PDFDebugger).
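
To illustrate the linkage described above: in the PDF specification, the
catalog's /PageLabels entry is a number tree whose keys are 0-based page
indices, each starting a labelling range with an optional style (/S), prefix
(/P), and start value (/St, default 1). The sketch below is a simplified model
of that lookup in plain Java. It is not PDFBox code; the class and method names
are hypothetical, and only the decimal and roman styles are handled (the letter
styles "A" and "a" are left out).

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

/** Simplified, hypothetical model of the /PageLabels number tree; not PDFBox code. */
final class PageLabelModel {
    /** One labelling range: /S style ("D", "r", "R", or null), /P prefix, /St start. */
    static final class Range {
        final String style;
        final String prefix;
        final int start;

        Range(String style, String prefix, int start) {
            this.style = style;
            this.prefix = prefix;
            this.start = start;
        }
    }

    // Key: 0-based index of the first page the range applies to.
    private final TreeMap<Integer, Range> ranges = new TreeMap<>();

    void addRange(int firstPageIndex, Range range) {
        ranges.put(firstPageIndex, range);
    }

    /** Returns the label for a 0-based page index, or null if no range covers it. */
    String labelFor(int pageIndex) {
        Map.Entry<Integer, Range> entry = ranges.floorEntry(pageIndex);
        if (entry == null) {
            return null;
        }
        Range r = entry.getValue();
        String prefix = (r.prefix == null) ? "" : r.prefix;
        if (r.style == null) {
            return prefix; // no numbering style: the label is the prefix alone
        }
        int number = r.start + (pageIndex - entry.getKey());
        if ("D".equals(r.style)) return prefix + number;
        if ("r".equals(r.style)) return prefix + roman(number);
        if ("R".equals(r.style)) return prefix + roman(number).toUpperCase(Locale.ROOT);
        throw new UnsupportedOperationException("style not covered by this sketch: " + r.style);
    }

    /** Lowercase roman numeral for a positive number. */
    static String roman(int n) {
        int[] values = {1000, 900, 500, 400, 100, 90, 50, 40, 10, 9, 5, 4, 1};
        String[] symbols = {"m", "cm", "d", "cd", "c", "xc", "l", "xl", "x", "ix", "v", "iv", "i"};
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < values.length; i++) {
            while (n >= values[i]) {
                sb.append(symbols[i]);
                n -= values[i];
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        PageLabelModel model = new PageLabelModel();
        model.addRange(0, new Range("D", "TOC-", 1));  // pages 0-1: TOC-1, TOC-2
        model.addRange(2, new Range("D", "Page ", 1)); // pages 2...: Page 1, Page 2
        for (int i = 0; i < 4; i++) {
            System.out.println(i + " -> " + model.labelFor(i));
        }
    }
}
```

A document labelled this way would return a non-null PDPageLabels from the
catalog; a null result means the file has no such tree at all.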
>>>>
>>>> Tilman
>>>>
>>>>
>>>>
>>>>> Thanks.
>>>>> Dave Patterson
>>>>>
>>>>> On Mon, May 15, 2017 at 12:13 PM, Tilman Hausherr <THausherr@t-online.de>
>>>>> wrote:
>>>>>
>>>>>> On 15.05.2017 at 15:20, David Patterson wrote:
>>>>>
>>>>>>> I've now got my code working to iterate through a PDDocument and
>>>>>>> process it page by page.
>>>>>>>
>>>>>>> Next hurdle: Is there a way to get the page number as printed? I've
>>>>>>> got page numbers like "TOC-1", "TOC-2", "Page 1", ...
>>>>>>>
>>>>>>> How much work is it to get the "TOC-1"?
>>>>>>>
>>>>>>> Thanks.
>>>>>>>
>>>>>>> Dave Patterson
>>>>>>>
>>>>>>>
>>>>>>        /**
>>>>>>         * Convenience method to get the page label if available.
>>>>>>         *
>>>>>>         * @param document
>>>>>>         * @param pageIndex 0-based page number.
>>>>>>         * @return a page label or null if not available.
>>>>>>         */
>>>>>>        public static String getPageLabel(PDDocument document, int pageIndex)
>>>>>>        {
>>>>>>            PDPageLabels pageLabels;
>>>>>>            try
>>>>>>            {
>>>>>>                pageLabels = document.getDocumentCatalog().getPageLabels();
>>>>>>            }
>>>>>>            catch (IOException ex)
>>>>>>            {
>>>>>>                return ex.getMessage();
>>>>>>            }
>>>>>>            if (pageLabels != null)
>>>>>>            {
>>>>>>                String[] labels = pageLabels.getLabelsByPageIndices();
>>>>>>                if (labels[pageIndex] != null)
>>>>>>                {
>>>>>>                    return labels[pageIndex];
>>>>>>                }
>>>>>>            }
>>>>>>            return null;
>>>>>>        }
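
The quoted method indexes labels[pageIndex] without a bounds check and returns
null when no label exists. A caller that always wants something to display
might wrap the lookup as sketched below. This is a hypothetical variant, not
part of PDFBox or PDFDebugger: it works on the String[] that
getLabelsByPageIndices() returns, and falling back to the physical page number
(pageIndex + 1) is an assumption about what the caller wants.

```java
/** Hypothetical helper working on the array returned by getLabelsByPageIndices(). */
final class SafePageLabel {
    /**
     * Returns the label for a 0-based page index, or the 1-based physical page
     * number as a string when the array is missing, too short, or has no entry.
     */
    static String labelOrDefault(String[] labels, int pageIndex) {
        if (pageIndex < 0) {
            throw new IllegalArgumentException("pageIndex must be >= 0: " + pageIndex);
        }
        if (labels != null && pageIndex < labels.length && labels[pageIndex] != null) {
            return labels[pageIndex];
        }
        return String.valueOf(pageIndex + 1); // assumed fallback: physical page number
    }

    public static void main(String[] args) {
        String[] labels = {"TOC-1", "TOC-2", "Page 1"};
        System.out.println(labelOrDefault(labels, 1)); // prints TOC-2
        System.out.println(labelOrDefault(null, 1));   // prints 2
    }
}
```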
>>>>>>
>>>>>>
>>>>>> ---------------------------------------------------------------------
>>>>>> To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
>>>>>> For additional commands, e-mail: users-help@pdfbox.apache.org
>>>>>>
>>>>>>
>>>>>>

