pdfbox-dev mailing list archives

From "Itai Shaked (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PDFBOX-4392) PDF completely blow up the RAM on amazon instances
Date Thu, 06 Dec 2018 11:59:00 GMT

    [ https://issues.apache.org/jira/browse/PDFBOX-4392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16711354#comment-16711354 ]

Itai Shaked commented on PDFBOX-4392:

So I looked into it more closely using a profiler (should have done this in the first place,
as it revealed the call to `ensureDisplayProfile` has negligible performance impact), and
it seems lots of time is wasted on the call to `toRGB` in line 170 of PDICCBased.  According
to the comment above the call, this is done to test for bad ICC profiles as described in PDFBOX-1295,
PDFBOX-1740 and PDFBOX-3610.  PDFBOX-1740 seems totally unrelated, so perhaps a typo in the
comment? I have tried removing the call, and the files from PDFBOX-1295 and PDFBOX-3610 both
render correctly (even with the call - no exception is thrown).  Furthermore, removing this
call cuts render time on the file in this issue by ~50%. 
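
For readers without the PDFBox source at hand, here is a minimal sketch of the kind of probe being discussed. It is not the actual PDFBox code: the class and method names are made up for illustration, and it uses the JDK's built-in sRGB profile instead of a profile embedded in a PDF. It only shows the idea of converting one dummy color so that a bad profile fails early, plus a crude timing of that single call.

```java
import java.awt.color.CMMException;
import java.awt.color.ColorSpace;
import java.awt.color.ICC_ColorSpace;
import java.awt.color.ICC_Profile;
import java.awt.color.ProfileDataException;

// Hypothetical helper, not part of PDFBox.
final class IccProbeSketch {

    /** Returns true if a single toRGB call succeeds on the given color space. */
    static boolean probe(ICC_ColorSpace cs) {
        try {
            // Convert one all-zero color; a broken profile is expected to
            // fail here rather than later during rendering.
            cs.toRGB(new float[cs.getNumComponents()]);
            return true;
        } catch (ProfileDataException | CMMException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumption: the built-in sRGB profile stands in for an embedded one.
        ICC_Profile profile = ICC_Profile.getInstance(ColorSpace.CS_sRGB);
        ICC_ColorSpace cs = new ICC_ColorSpace(profile);

        long start = System.nanoTime();
        boolean ok = probe(cs);
        long micros = (System.nanoTime() - start) / 1_000;

        System.out.println("probe ok=" + ok + ", took " + micros + " us");
    }
}
```

A single timed call like this is only a rough indication; the profiler run described above is the better measurement.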

Could the call be a workaround for a bug in an older JDK version, and so no longer be needed?
If so, perhaps a check could be added so that `toRGB` is only called on known bad JDK versions?
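
A version gate along those lines might look roughly like the sketch below. Everything in it is an assumption for illustration: the class name, the method names, and especially the threshold `LAST_AFFECTED_MAJOR`, since nothing in this thread establishes which JDK versions (if any) actually need the probe.

```java
// Hypothetical helper, not part of PDFBox.
final class JdkProbeGate {

    // Placeholder threshold only; the affected JDK versions are not known here.
    static final int LAST_AFFECTED_MAJOR = 8;

    /** Extracts the major version from a java.version style string. */
    static int majorVersion(String version) {
        // Legacy scheme: "1.8.0_191" -> 8. Current scheme: "11.0.1" -> 11, "9-ea" -> 9.
        String v = version.startsWith("1.") ? version.substring(2) : version;
        int end = 0;
        while (end < v.length() && Character.isDigit(v.charAt(end))) {
            end++;
        }
        if (end == 0) {
            throw new IllegalArgumentException("Unrecognized version: " + version);
        }
        return Integer.parseInt(v.substring(0, end));
    }

    /** True if the probe should still run on the given JDK version. */
    static boolean shouldProbe(String version) {
        return majorVersion(version) <= LAST_AFFECTED_MAJOR;
    }

    public static void main(String[] args) {
        String v = System.getProperty("java.version");
        System.out.println(v + " -> run toRGB probe? " + shouldProbe(v));
    }
}
```

The gate would then wrap the existing call, so newer JDKs skip the conversion while older ones keep the current behaviour.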

> PDF completely blow up the RAM on amazon instances
> --------------------------------------------------
>                 Key: PDFBOX-4392
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4392
>             Project: PDFBox
>          Issue Type: Bug
>    Affects Versions: 2.0.12
>            Reporter: Oleksandr Skoryi
>            Priority: Major
>             Fix For: 2.0.13
>         Attachments: 2f0f8f77-7a85-416d-b5d2-47a07d1416d4_3.pdf, 4392-prereadICC.patch
> Hi all
> The issue is pretty straightforward. I receive a lot of PDFs every day and render them. In most cases everything is OK, but PDFs which produce
> WARN org.apache.pdfbox.pdmodel.graphics.color.PDICCBased - ICC profile is Perceptual, ignoring, treating as Display class
> take very long to render and consume a lot of memory.
> Rendering takes from 5 to 15 min on an m5.large Amazon instance, but the attached PDF completely killed the instance: the Java process was simply killed by Linux during processing, with no exception in the logs.
> So could you please explain what is going on with files that produce the WARN message above, and how I can improve the rendering?
> Here are my VM options:
> -Dorg.apache.pdfbox.rendering.UsePureJavaCMYKConversion=true -Xmx3G -Xms2G -Dsun.java2d.cmm=sun.java2d.cmm.kcms.KcmsServiceProvider
> Also, don't hesitate to ask me for more PDFs, I have tons of them :D
> One more question: does the GPU have any influence on rendering?

This message was sent by Atlassian JIRA

