pdfbox-users mailing list archives

From Gang Fu <gangfu1...@gmail.com>
Subject Re: extract chinese letters from pdf
Date Wed, 29 Apr 2015 12:53:31 GMT
Hi John,

I tried downloading the latest source code using both svn (
http://pdfbox.apache.org/downloads.cgi#scm) and git clone (
https://github.com/apache/pdfbox).

When I built each checkout, I got the same error, showing that one file
could not be downloaded:

[WARNING] Retrying (1 more)
Downloading:
http://www.pdfa.org/wp-content/uploads/2011/08/isartor-pdfa-2008-08-13.zip
1K downloaded
[WARNING] Could not get content
org.apache.maven.plugin.MojoFailureException: Not same digest as expected:
expected <9f129c834bc6f9f8dabad4491c4c10ec> was
<0711b5cb6e5b0eed472b2c1c8a341431>


[ERROR] Failed to execute goal
com.googlecode.maven-download-plugin:download-maven-plugin:1.2.1:wget
(get-isartor) on project preflight: IO Error: Could not get content ->
[Help 1]
org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute
goal com.googlecode.maven-download-plugin:download-maven-plugin:1.2.1:wget
(get-isartor) on project preflight: IO Error
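
One way to narrow this down is to fetch the zip by hand and compare its digest with the one the build expects. Below is a minimal sketch, assuming the two values in the log are MD5 digests (they are 32 hex digits, but the log does not name the algorithm). `DigestCheck` is a hypothetical helper written for this message, not part of PDFBox or the download plugin. The "1K downloaded" line may mean the server returned something other than the archive, though that is only a guess.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: computes the MD5 of a local file and compares it
// with the digest the build log says it expected.
class DigestCheck {

    // Expected digest copied from the build log above (assumed to be MD5).
    static final String EXPECTED = "9f129c834bc6f9f8dabad4491c4c10ec";

    // MD5 of an in-memory byte array, as lowercase hex.
    static String md5Hex(byte[] data) throws NoSuchAlgorithmException {
        return toHex(MessageDigest.getInstance("MD5").digest(data));
    }

    // MD5 of a file, streamed in chunks so large archives are not loaded whole.
    static String md5Hex(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                md.update(buf, 0, n);
            }
        }
        return toHex(md.digest());
    }

    // Lowercase hex encoding, two digits per byte.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java DigestCheck <downloaded-file>");
            return;
        }
        Path file = Paths.get(args[0]);
        String actual = md5Hex(file);
        System.out.println("size:     " + Files.size(file) + " bytes");
        System.out.println("expected: " + EXPECTED);
        System.out.println("actual:   " + actual);
        System.out.println(actual.equals(EXPECTED) ? "digest matches" : "digest differs");
    }
}
```

Running it against a manually downloaded copy of the file should show whether the mismatch comes from the download itself. A file of only about 1 KB would be worth opening in a text editor to see what the server actually sent.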


Do you know how to resolve this problem?
Best,
Gang

On Wed, Apr 29, 2015 at 1:36 AM, John Hewson <john@jahewson.com> wrote:

>
> > On 28 Apr 2015, at 20:21, Gang Fu <gangfu1982@gmail.com> wrote:
> >
> > Hi,
> >
> >
> > I want to parse a PDF file that contains both Chinese and English
> > text. Which encoding should I use?
> >
> > The sample file is attached.
> >
>
> UTF-8. You want to use the trunk version of PDFBox (2.0) too.
>
> Our mailing list removes binary attachments, so you’ll have to post your
> PDF file somewhere public so that we can see it.
>
> — John
> > Thank you very much!
> >
> >
> > Best,
> >
> > Gang
