pdfbox-dev mailing list archives

From "Itai Shaked (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PDFBOX-4392) PDF completely blow up the RAM on amazon instances
Date Sun, 09 Dec 2018 09:43:00 GMT

    [ https://issues.apache.org/jira/browse/PDFBOX-4392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16713909#comment-16713909 ]

Itai Shaked commented on PDFBOX-4392:
-------------------------------------

[~tilman] - Since profiling shows that re-parsing the ICC profile has almost no impact, I
don't think I will continue working on the patch. At best it would give a negligible performance
improvement, and it would mean maintaining code in PDFBox that already exists in the
JDK, so I personally don't think it is worth it.

What I may try instead is finding a way to "predict" whether LCMS/KCMS will have a problem
with the ICC profile, so that the fallback can be performed without actually calling
`new Color` or other potentially slow operations. I am not entirely sure it is possible, but
learning more about ICC profiles is something I've been meaning to do anyway.

If I do manage to find a quick and reliable way to test the validity (in the sense tested
by the lines discussed here) of an ICC profile I will create a new patch, but perhaps by then
it would make sense to create a new issue altogether. 
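
To make the idea concrete, here is a minimal sketch of what a cheap pre-check could look like: it reads a few fixed-offset fields from the 128-byte ICC header (the `acsp` signature at offset 36, the profile class at offset 12, the rendering intent at offset 64) directly from the raw bytes, without instantiating anything in the CMM. The class `IccHeaderPeek` and all of its method names are hypothetical and not part of PDFBox. Whether these header fields are actually enough to predict an LCMS/KCMS failure is an untested assumption; the sketch only flags the combination mentioned in the WARN message (non-display class with Perceptual intent).

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper (not PDFBox API): peeks at ICC header fields without calling the CMM. */
final class IccHeaderPeek {
    // Field offsets in the fixed 128-byte ICC profile header.
    static final int HEADER_SIZE = 128;
    static final int OFF_CLASS = 12;   // profile/device class signature, e.g. "mntr", "prtr"
    static final int OFF_MAGIC = 36;   // profile file signature, always "acsp"
    static final int OFF_INTENT = 64;  // rendering intent, 0 = perceptual

    private IccHeaderPeek() {
    }

    /** Reads a 4-character ASCII signature at the given offset. */
    static String signature(byte[] icc, int offset) {
        return new String(icc, offset, 4, StandardCharsets.US_ASCII);
    }

    /** True if the data is long enough for a header and carries the "acsp" signature. */
    static boolean hasValidHeader(byte[] icc) {
        return icc != null && icc.length >= HEADER_SIZE
                && "acsp".equals(signature(icc, OFF_MAGIC));
    }

    /** Rendering intent as a big-endian 32-bit value (ICC data is big-endian). */
    static int renderingIntent(byte[] icc) {
        return ByteBuffer.wrap(icc).getInt(OFF_INTENT);
    }

    /**
     * Assumption, not a verified rule: treat a profile as "suspect" if its header is
     * invalid, or if it is not a display ("mntr") profile and declares Perceptual intent.
     */
    static boolean isSuspect(byte[] icc) {
        if (!hasValidHeader(icc)) {
            return true;
        }
        boolean display = "mntr".equals(signature(icc, OFF_CLASS));
        return !display && renderingIntent(icc) == 0;
    }
}
```

A caller could run `IccHeaderPeek.isSuspect(bytes)` on the raw profile stream and go straight to the fallback color space when it returns true. That decision rule is the part that would need to be validated against real problem files.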

> PDF completely blow up the RAM on amazon instances
> --------------------------------------------------
>
>                 Key: PDFBOX-4392
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4392
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Rendering
>    Affects Versions: 2.0.12
>            Reporter: Oleksandr Skoryi
>            Priority: Major
>             Fix For: 2.0.14
>
>         Attachments: 2f0f8f77-7a85-416d-b5d2-47a07d1416d4_3.pdf, 4392-prereadICC.patch
>
>
> Hi all
> The issue is pretty straightforward. I receive a lot of PDFs every day and render them.
> In most cases everything is OK, but PDFs which produce
> WARN org.apache.pdfbox.pdmodel.graphics.color.PDICCBased - ICC profile is Perceptual, ignoring, treating as Display class
> take a very long time to render and consume a lot of memory.
> Rendering takes from 5 to 15 min on an m5.large Amazon instance. But the attached PDF completely
> killed the instance. The Java process is simply killed by Linux during processing, with no
> exception in the logs.
> So could you please explain what is going on with files that produce the WARN message
> above, and how I can improve the rendering?
>  
> Here are my VM options:
> -Dorg.apache.pdfbox.rendering.UsePureJavaCMYKConversion=true -Xmx3G -Xms2G -Dsun.java2d.cmm=sun.java2d.cmm.kcms.KcmsServiceProvider
> Also, don't hesitate to ask me for more PDFs, I have tons of them :D
>  
> One more question: does the GPU have any influence on rendering?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


