pdfbox-dev mailing list archives

From "Itai Shaked (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PDFBOX-4392) PDF completely blow up the RAM on amazon instances
Date Mon, 03 Dec 2018 12:58:00 GMT

    [ https://issues.apache.org/jira/browse/PDFBOX-4392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16707163#comment-16707163 ]

Itai Shaked commented on PDFBOX-4392:
-------------------------------------

Regardless of whether the Intent is checked correctly, I have put together a patch that reads
the profile data and makes the necessary fix before calling ICC_Profile.getInstance. Tested
on the file attached here: before this change it takes 104s to render on my machine, and
after the change it takes 62s (about 40% less), so at least in cases like this, with thousands
of problematic profiles, it makes a significant difference.

I do not currently have time for more extensive tests, and the patch could use some refining
(for example, it currently returns null on errors and should probably throw an exception instead),
but I'm hoping to get to it later this week (unless someone else does it first).
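
For readers without the attachment, the idea can be sketched roughly as follows. This is a minimal illustration and not the contents of 4392-prereadICC.patch: the class and method names (IccHeaderPatch, patchToDisplayClass, readIntBE, writeIntBE) are hypothetical, and it assumes the "necessary fix" is the one the PDICCBased warning describes, i.e. rewriting the header's device class to Display when the rendering intent is Perceptual. Patching the raw bytes first presumably lets the profile be constructed only once.

```java
import java.awt.color.ICC_Profile;
import java.util.Arrays;

// Hypothetical helper, not part of PDFBox: fixes the ICC header bytes
// before any ICC_Profile object is created from them.
final class IccHeaderPatch {
    // An ICC profile header is 128 bytes long.
    static final int HEADER_SIZE = 128;

    // Reads a big-endian 32-bit value, the byte order used by ICC headers.
    static int readIntBE(byte[] d, int off) {
        return ((d[off] & 0xFF) << 24) | ((d[off + 1] & 0xFF) << 16)
             | ((d[off + 2] & 0xFF) << 8) | (d[off + 3] & 0xFF);
    }

    // Writes a big-endian 32-bit value.
    static void writeIntBE(int v, byte[] d, int off) {
        d[off] = (byte) (v >>> 24);
        d[off + 1] = (byte) (v >>> 16);
        d[off + 2] = (byte) (v >>> 8);
        d[off + 3] = (byte) v;
    }

    // Returns the input unchanged if no fix is needed, otherwise a patched copy
    // whose device class is Display. Assumption: only Perceptual-intent profiles
    // that are not already Display class are patched.
    static byte[] patchToDisplayClass(byte[] data) {
        if (data == null || data.length < HEADER_SIZE) {
            // Throw rather than return null on bad input.
            throw new IllegalArgumentException("ICC profile data shorter than header");
        }
        int deviceClass = readIntBE(data, ICC_Profile.icHdrDeviceClass);
        int intent = readIntBE(data, ICC_Profile.icHdrRenderingIntent);
        if (deviceClass == ICC_Profile.icSigDisplayClass
                || intent != ICC_Profile.icPerceptual) {
            return data;
        }
        byte[] patched = Arrays.copyOf(data, data.length);
        writeIntBE(ICC_Profile.icSigDisplayClass, patched, ICC_Profile.icHdrDeviceClass);
        return patched;
    }
}
```

A caller would then pass the returned array to ICC_Profile.getInstance(byte[]). The sketch reads the rendering intent as a full 32-bit field and throws on truncated data instead of returning null; both are choices made for this illustration.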

 

[^4392-prereadICC.patch]

> PDF completely blow up the RAM on amazon instances
> --------------------------------------------------
>
>                 Key: PDFBOX-4392
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4392
>             Project: PDFBox
>          Issue Type: Bug
>    Affects Versions: 2.0.12
>            Reporter: Oleksandr Skoryi
>            Priority: Major
>             Fix For: 2.0.13
>
>         Attachments: 2f0f8f77-7a85-416d-b5d2-47a07d1416d4_3.pdf, 4392-prereadICC.patch
>
>
> Hi all,
> The issue is pretty straightforward. I receive a lot of PDFs every day and render them. In most cases everything is OK, but PDFs which produce
> WARN org.apache.pdfbox.pdmodel.graphics.color.PDICCBased - ICC profile is Perceptual, ignoring, treating as Display class
> take very long to render and consume a lot of memory.
> Rendering takes from 5 to 15 minutes on an m5.large Amazon instance, but the attached PDF completely killed the instance: the Java process is simply killed by Linux during processing, with no exception in the logs.
> So could you please explain what is going on with files that produce the WARN message above, and how I can improve the rendering?
>
> Here are my VM options:
> -Dorg.apache.pdfbox.rendering.UsePureJavaCMYKConversion=true -Xmx3G -Xms2G -Dsun.java2d.cmm=sun.java2d.cmm.kcms.KcmsServiceProvider
> Also, don't hesitate to ask me for more PDFs, I have tons of them :D
>
> One more question: does the GPU have any influence on rendering?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
