pdfbox-users mailing list archives

From "Gabriel" <gabr...@bry.com.br>
Subject RES: RES: Trouble loading large files with PDFBox 2.0.2
Date Mon, 05 Sep 2016 11:24:13 GMT
Thank you very much, I'll do it and come back with a reply.

-----Original Message-----
From: Tilman Hausherr [mailto:THausherr@t-online.de]
Sent: Saturday, September 3, 2016 12:28
To: users@pdfbox.apache.org
Subject: Re: RES: Trouble loading large files with PDFBox 2.0.2

On 02.09.2016 at 17:15, Tilman Hausherr wrote:
>
> Maybe compare with loadNonSeq in the 1.8 version. The 2.0 parser is 
> different than the old parser, and maybe slower because it has to go 
> back and forth.
>
> You said your file is confidential. If you have some time to kill,
> you could go here:
> http://digitalcorpora.org/corp/files/govdocs1/zipfiles/
>
> and try to find a PDF file that shows this effect.


Here are some large files from there; all are larger than 15 MB. To get one, download the
zip file from the link above whose name starts with the first three digits of the file name.
For example, to get 475419.pdf, you need to download

http://digitalcorpora.org/corp/files/govdocs1/zipfiles/475.zip
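
The naming rule above can be sketched in a few lines of Java. This is only an illustration: the class and method names (GovdocsZipUrl, zipUrlFor) are made up here, and it assumes every archive is named after the first three digits of the files it contains, as described in this message.

```java
final class GovdocsZipUrl {
    // Base URL quoted in this thread.
    private static final String BASE =
            "http://digitalcorpora.org/corp/files/govdocs1/zipfiles/";

    private GovdocsZipUrl() {}

    /** Returns the URL of the zip archive expected to contain the given file. */
    static String zipUrlFor(String fileName) {
        if (fileName == null || fileName.length() < 3) {
            throw new IllegalArgumentException("expected a name such as 475419.pdf");
        }
        // The archive name is the first three digits of the file name.
        String prefix = fileName.substring(0, 3);
        for (int i = 0; i < prefix.length(); i++) {
            char c = prefix.charAt(i);
            if (c < '0' || c > '9') {
                throw new IllegalArgumentException("name must start with three digits: " + fileName);
            }
        }
        return BASE + prefix + ".zip";
    }
}
```

With this sketch, zipUrlFor("475419.pdf") gives the 475.zip URL shown above.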

Then run your test, and if loading seems much slower, mention the file here.

475419.pdf
620038.pdf
302439.pdf
209086.pdf
755045.pdf
503657.pdf
767115.pdf
942416.pdf
364591.pdf
240242.pdf
574442.pdf
560466.pdf
134823.pdf
234570.pdf
071300.pdf
884613.pdf
022391.pdf
160655.pdf
898927.pdf
509787.pdf
125112.pdf
486395.pdf
510488.pdf
586504.pdf
510944.pdf
861355.pdf
765184.pdf
659153.pdf
173325.pdf
577932.pdf
990065.pdf
660170.pdf
660166.pdf
584460.pdf
979241.pdf
483173.pdf
078656.pdf
296771.pdf
434841.pdf
922401.pdf
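
For the timing comparison itself, a small harness along the following lines may help. It is a rough sketch, not a proper benchmark: the names LoadTimer and PdfOpener are hypothetical, and the harness deliberately knows nothing about PDFBox, so the same code can time the 1.8 and the 2.0 loading calls.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

final class LoadTimer {
    /** Hypothetical hook: wraps whatever call opens (and closes) the document. */
    interface PdfOpener {
        void open(File pdf) throws Exception;
    }

    private LoadTimer() {}

    /** Runs the opener the given number of times; returns elapsed milliseconds per run. */
    static List<Long> time(PdfOpener opener, File pdf, int runs) throws Exception {
        List<Long> millis = new ArrayList<>();
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            opener.open(pdf);
            millis.add((System.nanoTime() - start) / 1_000_000L);
        }
        return millis;
    }
}
```

Assuming PDFBox 2.0.x is on the classpath, the opener could be f -> PDDocument.load(f).close(); with 1.8.x, f -> PDDocument.loadNonSeq(f, null).close(). The first run usually includes class loading and JIT warm-up, so it is probably fairer to compare the later runs.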


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: users-help@pdfbox.apache.org




