pdfbox-users mailing list archives

From kulbhushan singh <kulbhushan.t...@gmail.com>
Subject Fwd: Junk Characters while Extracting text from pdf file.
Date Tue, 05 Feb 2013 14:01:53 GMT

I am trying to extract text from a PDF file that uses custom fonts, but the
output contains junk characters. The fonts used are ArialMT (embedded subset)
and Arial-BoldMT (embedded subset). The producer of the PDF file is GPL
Ghostscript 8.15. I am using PDFTextStripper to extract the text. How can I
extract the text correctly when such fonts are used? Any reference or
solution would be appreciated.

Regards, Kulbhushan
