pdfbox-users mailing list archives

From John Hewson <j...@jahewson.com>
Subject Re: PDFTextStripper question
Date Fri, 05 Jun 2015 19:33:48 GMT
If you’re also interested in getting the bounding boxes of individual glyphs then check out:

https://github.com/apache/pdfbox/blob/trunk/examples/src/main/java/org/apache/pdfbox/examples/rendering/CustomPageDrawer.java

— John
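
Editorial sketch (not part of the original message): once per-glyph boxes are available, one way to get word positions is to merge the boxes of consecutive non-whitespace glyphs. The code below is a minimal, library-independent illustration of that idea. Glyph, WordBox and WordBoxes are hypothetical names, not PDFBox classes; in real PDFBox code the coordinates would come from TextPosition getters such as getXDirAdj(), getYDirAdj(), getWidthDirAdj() and getHeightDir(). It also assumes that y is the top edge of the glyph and that words are separated by explicit whitespace glyphs, which does not hold for every PDF.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for the per-glyph data a PDF library reports. */
final class Glyph {
    final String text;
    final float x, y, width, height; // assumption: (x, y) is the top-left corner

    Glyph(String text, float x, float y, float width, float height) {
        this.text = text;
        this.x = x;
        this.y = y;
        this.width = width;
        this.height = height;
    }
}

/** A word plus the union of the boxes of its glyphs. */
final class WordBox {
    final String text;
    final float x, y, width, height;

    WordBox(String text, float x, float y, float width, float height) {
        this.text = text;
        this.x = x;
        this.y = y;
        this.width = width;
        this.height = height;
    }

    @Override
    public String toString() {
        return text + " @ (" + x + ", " + y + ") " + width + " x " + height;
    }
}

final class WordBoxes {
    /** Groups glyphs (in reading order) into words, splitting on whitespace glyphs. */
    static List<WordBox> group(List<Glyph> glyphs) {
        List<WordBox> words = new ArrayList<>();
        StringBuilder text = new StringBuilder();
        float minX = 0, minY = 0, maxX = 0, maxY = 0;

        for (Glyph g : glyphs) {
            if (g.text.trim().isEmpty()) {
                // A whitespace glyph closes the word collected so far, if any.
                if (text.length() > 0) {
                    words.add(new WordBox(text.toString(), minX, minY, maxX - minX, maxY - minY));
                    text.setLength(0);
                }
                continue;
            }
            if (text.length() == 0) {
                // First glyph of a new word: start the box from this glyph.
                minX = g.x;
                minY = g.y;
                maxX = g.x + g.width;
                maxY = g.y + g.height;
            } else {
                // Later glyphs: grow the box to cover them.
                minX = Math.min(minX, g.x);
                minY = Math.min(minY, g.y);
                maxX = Math.max(maxX, g.x + g.width);
                maxY = Math.max(maxY, g.y + g.height);
            }
            text.append(g.text);
        }
        // Flush the last word when the input does not end in whitespace.
        if (text.length() > 0) {
            words.add(new WordBox(text.toString(), minX, minY, maxX - minX, maxY - minY));
        }
        return words;
    }

    public static void main(String[] args) {
        List<Glyph> glyphs = Arrays.asList(
                new Glyph("H", 10, 100, 6, 10),
                new Glyph("i", 16, 100, 3, 10),
                new Glyph(" ", 19, 100, 3, 10),
                new Glyph("y", 22, 102, 5, 10),
                new Glyph("o", 27, 100, 5, 8));
        for (WordBox w : group(glyphs)) {
            System.out.println(w);
        }
    }
}
```

Splitting on whitespace glyphs keeps the sketch short. A document that positions words without space glyphs would need a gap threshold between neighbouring glyphs instead.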

> On 5 Jun 2015, at 09:31, Lorena Leishman <lorenaleishman@yahoo.com.INVALID> wrote:
> 
> Will do. Thanks!
> From: Tilman Hausherr <THausherr@t-online.de>
> To: users@pdfbox.apache.org 
> Sent: Friday, June 5, 2015 10:13 AM
> Subject: Re: PDFTextStripper question
> 
> Yes, see the PrintTextLocations.java example.
> 
> See also
> https://stackoverflow.com/questions/11873801/using-pdfbox-to-determine-the-coordinates-of-words-in-a-document
> https://stackoverflow.com/questions/16579146/pdfbox-1-8-printtextlocations-wrong-textposition-height-for-a-multi-page-pdf
> https://stackoverflow.com/questions/21207943/pdfbox-text-extraction-with-bold-italic-info-does-not-work-on-some-files
> for possible problems / solutions.
> 
> Tilman
> 
> 
> 
> Am 05.06.2015 um 16:55 schrieb Lorena Leishman:
>> Is there a way to use PDFTextStripper and have it return the text in the position it had in the PDF? Or is there a way to get the positions of the words?
>> Lorena
> 
> 