pdfbox-users mailing list archives

From Walter Kehl <walter.k...@outlook.com>
Subject Corrupted words when using PDFTextStripper
Date Mon, 09 Jun 2014 09:55:02 GMT


I am new to the list, so I don't know whether this has been asked before:


I am using PDFTextStripper (embedded in another application) to extract the
raw text of PDFs. So far the results have been good, but recently I came
across a PDF file for which the PDFTextStripper output was corrupted. I got
sentences like:


"There is al o con ern that b nkers may be pushed to misprice risk (No. 6)
by the pres ures of c mpetition and an abunda ce of central b nk-provided


where characters seem to be missing. Does anyone have an idea what went
wrong here and how I could prevent it?




Thanks for your help


Walter Kehl



