pdfbox-users mailing list archives

From Max Pimm <m...@alwayssunny.com>
Subject Best way to detect strings based on text style
Date Tue, 17 Jan 2012 10:21:43 GMT
I need to find the area occupied by certain strings in a document.

These strings are a sequence of characters with a predefined text style 
(font, font size and color).

I've been looking at the PrintTextLocations example. With the document 
I'm testing, it processes the text character by character.

My initial plan is to modify this to recognize words based on font, 
font size and color.
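
To make the plan concrete, here is a minimal sketch of the grouping step 
on its own. It assumes each character has already been reduced to a small 
record holding its text, font name, font size, color and box. The Glyph 
and Run types and the StyleRuns class are hypothetical names made up for 
this illustration, not PDFBox classes, and the sketch does not show how 
those values would be read from PDFBox. It is one possible approach, not 
necessarily the recommended one.

```java
import java.util.ArrayList;
import java.util.List;

class StyleRuns {
    // Hypothetical per-character record, not a PDFBox type. (x, y) is assumed
    // to be the lower-left corner of the character's box, in page units.
    record Glyph(String text, String font, float fontSize, int rgb,
                 float x, float y, float width, float height) {
        boolean sameStyle(Glyph other) {
            return font.equals(other.font)
                    && fontSize == other.fontSize
                    && rgb == other.rgb;
        }
    }

    // A run of consecutive same-style characters and the box enclosing them.
    record Run(String text, String font, float fontSize, int rgb,
               float minX, float minY, float maxX, float maxY) {}

    // Splits the character sequence wherever the style changes.
    static List<Run> group(List<Glyph> glyphs) {
        List<Run> runs = new ArrayList<>();
        int start = 0;
        for (int i = 1; i <= glyphs.size(); i++) {
            boolean atEnd = i == glyphs.size();
            if (atEnd || !glyphs.get(i).sameStyle(glyphs.get(start))) {
                runs.add(toRun(glyphs.subList(start, i)));
                start = i;
            }
        }
        return runs;
    }

    // Concatenates the text and takes the union of the character boxes.
    private static Run toRun(List<Glyph> part) {
        Glyph first = part.get(0);
        StringBuilder text = new StringBuilder();
        float minX = Float.MAX_VALUE, minY = Float.MAX_VALUE;
        float maxX = -Float.MAX_VALUE, maxY = -Float.MAX_VALUE;
        for (Glyph g : part) {
            text.append(g.text());
            minX = Math.min(minX, g.x());
            minY = Math.min(minY, g.y());
            maxX = Math.max(maxX, g.x() + g.width());
            maxY = Math.max(maxY, g.y() + g.height());
        }
        return new Run(text.toString(), first.font(), first.fontSize(),
                first.rgb(), minX, minY, maxX, maxY);
    }

    public static void main(String[] args) {
        // Made-up sample data: two bold red characters, then one plain black one.
        List<Glyph> glyphs = List.of(
                new Glyph("H", "Helvetica-Bold", 12f, 0xFF0000, 10f, 100f, 8f, 12f),
                new Glyph("i", "Helvetica-Bold", 12f, 0xFF0000, 18f, 100f, 4f, 12f),
                new Glyph("x", "Helvetica", 10f, 0x000000, 22f, 100f, 5f, 10f));
        for (Run run : group(glyphs)) {
            System.out.println(run);
        }
    }
}
```

This groups by style only. It does not split on whitespace or line breaks, 
so a run may cover several words, and a run that wraps onto a second line 
would get one box spanning both lines. Matching a predefined style would 
then be a filter over the returned runs.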

Is this the best strategy or would you recommend another? Is there any 
way to process words rather than characters?

On a side note, for documentation I'm using the javadoc and the classes in 
the examples package. If you could point me towards any more explanatory 
information I would be grateful.


