pdfbox-users mailing list archives

From Rachel Arbit <rac...@citypath.com>
Subject Page number discrepancy
Date Mon, 30 Jul 2012 13:17:15 GMT
Hi all,
I'm using PDFBox on a book in PDF format, and I'm trying to map the terms
listed in the index to the chapters in which they appear.

As I understand it, there are actually two page numbering schemes. One is
the straight page number in the PDF, which counts the cover page, all the
introductory pages, etc. The second is the numbering as it appears on the
pages of the book, so that all the introductory page numbers are Roman
numerals, and page 1 is only on page 25 of the PDF.
When I look at the PDF in a reader, it shows me both numbers, e.g. xi
(12 / 841) or 18 (43 / 841).

I'm only managing to get the number of the page in the PDF, and not the
number as it's printed on the page. I need the printed number because the
index uses it to map terms to pages. E.g. page 15 in the index actually
means page 40 of the PDF.
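
To make the mapping concrete, here is a minimal Java sketch of the
arithmetic only. It does not read anything from the PDF. The class name
PageLabelSketch and its methods are hypothetical, and the two offsets are
assumptions inferred from the reader display quoted above (xi is PDF page
12, 18 is PDF page 43) and from the index example (15 is PDF page 40); they
would differ for another book.

```java
import java.util.Locale;

// Hypothetical helper: converts a printed page label to a 1-based PDF page number.
class PageLabelSketch {
    // Assumed offsets, inferred from the examples in this message:
    // "xi (12 / 841)" gives 11 + 1 = 12, "18 (43 / 841)" gives 18 + 25 = 43.
    static final int ROMAN_OFFSET = 1;
    static final int ARABIC_OFFSET = 25;

    // Value of a single lower-case Roman numeral character.
    static int value(char c) {
        switch (c) {
            case 'i': return 1;
            case 'v': return 5;
            case 'x': return 10;
            case 'l': return 50;
            case 'c': return 100;
            case 'd': return 500;
            case 'm': return 1000;
            default:
                throw new IllegalArgumentException("Not a Roman numeral: " + c);
        }
    }

    // Parses a well-formed Roman numeral by scanning from right to left:
    // a symbol smaller than the one to its right is subtracted, otherwise added.
    static int romanToInt(String roman) {
        String s = roman.toLowerCase(Locale.ROOT);
        int total = 0;
        int prev = 0;
        for (int i = s.length() - 1; i >= 0; i--) {
            int v = value(s.charAt(i));
            total += (v < prev) ? -v : v;
            prev = v;
        }
        return total;
    }

    // Maps a printed label ("xi", "18") to the PDF page number under the assumed offsets.
    static int toPdfPage(String label) {
        String s = label.trim();
        if (s.matches("\\d+")) {
            return Integer.parseInt(s) + ARABIC_OFFSET;
        }
        return romanToInt(s) + ROMAN_OFFSET;
    }

    public static void main(String[] args) {
        for (String label : new String[] {"xi", "18", "15"}) {
            System.out.println(label + " -> " + toPdfPage(label));
        }
    }
}
```

Hard-coded offsets like these break as soon as the book has unnumbered
inserts or a second numbering restart, which is why I would rather read the
printed numbers from the file itself if they are stored there.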

Anyone have any idea how to get the numbers on the pages? Is that info part
of the PDF at all?

Thanks in advance!
