pdfbox-users mailing list archives

From Slava G <slav...@gmail.com>
Subject Re: Corrupted PDF file causing severe OOM
Date Thu, 16 May 2019 06:46:18 GMT
Got you.
Thanks

On Thu, May 16, 2019 at 6:42 AM Tilman Hausherr <THausherr@t-online.de>
wrote:

> On 15.05.2019 at 21:57, Slava G wrote:
> > But I tried to extract text using 2.0.15 and immediately got an exception
> > and didn't get an OOM.
>
>
> I got a slow response on the second page. I didn't wait for the OOM.
>
> Tilman
>
>
>
> >
> > On Wed, May 15, 2019, 22:52 Tilman Hausherr <THausherr@t-online.de> wrote:
> >
> >> On 15.05.2019 at 16:00, Slava G wrote:
> >>> But it seems that in PDFBox 2.0.15 it's already fixed as, when I run tika-app
> >>
> >>
> >> No, it's not fixed. The cause is a corrupt ToUnicode stream. It is fixed in
> >>
> >> https://issues.apache.org/jira/browse/PDFBOX-4550
> >>
> >> Try a snapshot build in a few hours:
> >>
> >>
> >>
> >> https://repository.apache.org/content/groups/snapshots/org/apache/pdfbox/pdfbox-app/2.0.16-SNAPSHOT/
> >>
> >> Tilman
> >>
> >>
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
> >> For additional commands, e-mail: users-help@pdfbox.apache.org
> >>
> >>
>
>
