pdfbox-users mailing list archives

From Tilman Hausherr <THaush...@t-online.de>
Subject Re: unijis-ucs2-hw-h problems
Date Sat, 25 Jul 2015 11:23:20 GMT
Hello,

1. Does text extraction work with Adobe Reader?
2. Could you upload the file to a public location?

Tilman

On 25.07.2015 at 09:42, 牛小伟 wrote:
> Dear team,
>
> We are using your product PDFBox 1.6 for text extraction.
> When we process a document that uses the encoding UniJIS-UCS2-HW-H,
> the output is unreadable, like this: (????????????????????????3?????????????).
> We have tried some other approaches, but they don't work.
> We also have some documents with the encoding GBK-EUC-H, and PDFBox
> handles those perfectly. I also tried PDFBox 1.8; it didn't work either.
> I checked the charsets in PDFBox; both encodings are included.
> I don't know why one works and the other does not.
> We hope you can help with this. Thank you very much.
>
>
> Best regards.
>
>
> Snapshot of the document showing the encoding:
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
> For additional commands, e-mail: users-help@pdfbox.apache.org
>

