pdfbox-users mailing list archives

From "John Walker" <jo...@newconceptsdev.com>
Subject Problems Using PDFBox To Manually Track TextPosition
Date Sat, 15 Aug 2015 00:06:28 GMT


I'm using PDFBox to parse the content stream for a page in a PDF. Based on
the list of operations, I expect two lines of text to be in very different
places on the page vertically. However, when the page is displayed in
Sumatra or Acrobat, this text is vertically aligned.


The method I'm using to predict text position has been accurate in the past.
I'm not sure if the method is faulty, or if I'm misunderstanding the
operation list I'm getting from PDFBox.


Here is the list of operations, with annotations explaining how I think they
should impact vertical position of text cursor: 




As you can see, I'm basically only moving my model of the cursor in reaction
to Tm and Td operators. (TJ operators aren't relevant because the text is
horizontal and the y position is the one I'm tracking.) I also ignored the
cm, because there's a Tm right after it.
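
For reference, the kind of cursor model described above can be sketched in
plain Java. This is a hypothetical helper, not PDFBox code: the class name
TextYTracker and its methods are made up for illustration. It assumes the
matrix semantics given in the PDF specification, where Td is applied in text
space (so its offset is scaled by the current Tm) and the final position is
the text line matrix multiplied by the CTM (so a cm issued before a Tm still
contributes). Unlike the model described above, the sketch keeps cm, so the
two predictions can be compared.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical helper: tracks only the y origin of the current text line. */
class TextYTracker {
    // Matrices are stored PDF-style as {a, b, c, d, e, f}.
    private double[] ctm = {1, 0, 0, 1, 0, 0};
    private double[] lineMatrix = {1, 0, 0, 1, 0, 0};
    private double leading = 0;
    private final Deque<double[]> savedCtms = new ArrayDeque<>();

    /** Returns m1 x m2, meaning m1 is applied first (row-vector convention). */
    static double[] multiply(double[] m1, double[] m2) {
        return new double[] {
            m1[0] * m2[0] + m1[1] * m2[2],
            m1[0] * m2[1] + m1[1] * m2[3],
            m1[2] * m2[0] + m1[3] * m2[2],
            m1[2] * m2[1] + m1[3] * m2[3],
            m1[4] * m2[0] + m1[5] * m2[2] + m2[4],
            m1[4] * m2[1] + m1[5] * m2[3] + m2[5]
        };
    }

    /** Applies one operator; operands are passed in content-stream order. */
    void apply(String op, double... args) {
        switch (op) {
            case "q":  savedCtms.push(ctm.clone()); break;
            case "Q":  if (!savedCtms.isEmpty()) ctm = savedCtms.pop(); break;
            case "cm": ctm = multiply(args, ctm); break;            // concatenates, never replaces
            case "BT": lineMatrix = new double[] {1, 0, 0, 1, 0, 0}; break;
            case "Tm": lineMatrix = args.clone(); break;            // replaces the text matrix only
            case "TL": leading = args[0]; break;
            case "TD": leading = -args[1];                          // then behaves like Td
            case "Td": lineMatrix = multiply(
                           new double[] {1, 0, 0, 1, args[0], args[1]}, lineMatrix);
                       break;
            case "T*": lineMatrix = multiply(
                           new double[] {1, 0, 0, 1, 0, -leading}, lineMatrix);
                       break;
            default:   break;                                       // Tf, TJ, ET, ... ignored here
        }
    }

    /** Y of the current line origin after applying both text matrix and CTM. */
    double currentY() {
        return multiply(lineMatrix, ctm)[5];
    }

    public static void main(String[] args) {
        TextYTracker t = new TextYTracker();
        t.apply("cm", 1, 0, 0, 1, 0, 100);  // made-up operands, for illustration only
        t.apply("BT");
        t.apply("Tm", 2, 0, 0, 2, 10, 20);
        System.out.println(t.currentY());   // 120.0: Tm's 20 plus the cm offset of 100
        t.apply("Td", 0, -5);
        System.out.println(t.currentY());   // 110.0: the Td offset is scaled by Tm's d = 2
    }
}
```

The operands in main are invented and are not taken from the operation list
in this message. With them, a model that ignores cm and treats Td as an
unscaled offset would predict y = 20 and then y = 15, rather than 120 and
110.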


Am I misinterpreting the PDF operators (as I suspect)? Could this be a
PDFBox issue?


Thanks in advance!


