pdfbox-users mailing list archives

From Big Donkeys <big.donk...@yahoo.com>
Subject Can't extract text Adobe-WinCharSetFFFF-UCS2
Date Thu, 19 Jul 2012 20:09:01 GMT
Hi, I'm having some trouble extracting text from some South Korean PDF files using
PDFTextStripper. When I try, I get a "severe error could not parse predefined CMAP file for
'Adobe-WinCharSetFFFF-UCS2'" message, and the extracted text comes out as gibberish. The file
opens and displays fine in Adobe Reader. I'm using pdfbox-app-1.7.0.jar.

Here is a link to an example PDF that gives me trouble:


Any ideas?  
