pdfbox-users mailing list archives

From Walter Kehl <walter.k...@outlook.com>
Subject Corrupted words when using PDFTextStripper
Date Mon, 09 Jun 2014 09:55:02 GMT
Hi, 

 

I am new to the list so I don't know whether this has been asked before:

 

I am using PDFTextStripper (embedded in another application) to get the
raw text of PDFs. So far the results have been good, but recently a PDF
file turned up for which the output of PDFTextStripper was corrupted. I got
sentences like:

 

"There is al o con ern that b nkers may be pushed to misprice risk (No. 6)
by the pres ures of c mpetition and an abunda ce of central b nk-provided
liquidity."

 

where characters seem to be missing. Does anyone have any idea what went
wrong here and how I could prevent it?
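
To narrow this down, it may help to list exactly which characters were dropped. The sketch below is plain Java with no PDFBox calls. It assumes the intended words are the obvious ones ("also", "concern", "bankers") and that each missing character came out as a single space, which is what the sample above suggests. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class MissingGlyphs {

    // Hypothetical helper: walks both strings position by position and
    // collects every character of the intended text that appears as a
    // space in the extracted text.
    static List<Character> missing(String extracted, String intended) {
        if (extracted.length() != intended.length()) {
            throw new IllegalArgumentException("strings must have equal length");
        }
        List<Character> dropped = new ArrayList<>();
        for (int i = 0; i < extracted.length(); i++) {
            char got = extracted.charAt(i);
            char want = intended.charAt(i);
            if (got == ' ' && want != ' ') {
                dropped.add(want);
            }
        }
        return dropped;
    }

    public static void main(String[] args) {
        // Fragment of the corrupted output quoted above.
        String extracted = "al o con ern that b nkers";
        // Assumed intended text (a guess, not taken from the PDF itself).
        String intended = "also concern that bankers";
        System.out.println(missing(extracted, intended));
    }
}
```

If the dropped characters turn out to be only a few distinct letters, or all come from one part of the page, that detail would be worth including when reporting the problem.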

 

 

 

Thanks for your help

 

Walter Kehl

 

 

 

