pdfbox-dev mailing list archives

From "Ben Manes (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PDFBOX-4396) Memory leak due to soft reference caching
Date Thu, 06 Dec 2018 08:00:00 GMT

    [ https://issues.apache.org/jira/browse/PDFBOX-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16711099#comment-16711099 ]

Ben Manes commented on PDFBOX-4396:
-----------------------------------

This is probably covered in the tickets you referenced. The comment in ScratchFileBuffer states:
{code:java}
/**
 * While calling finalize is normally discouraged we will have to
 * use it here as long as closing a scratch file buffer is not 
 * done in every case. Currently {@link COSStream} creates new
 * buffers without closing the old one - which might still be
 * used.
 * 
 * <p>Enabling debugging one will see if there are still cases
 * where the buffer is not closed.</p>
 */{code}
I wasn't able to reproduce the problem in isolation on the PDF document that failed (450 MB,
999 pages). I could process it locally in ~12 minutes, the same as Ghostscript. The failure may
be due to the additional load on the machine: the processing is CPU heavy, I process multiple
PDFs and pages in parallel, and there is other incoming work. As G1 is quota driven, the CPU
thrashing likely keeps it from finishing its work within the desired time frames.
When it exhausts its quota and is unable to keep up, that would eventually lead to an OOME.
Since Java lacks functioning thread priorities, we can't de-emphasize application threads
in favor of the collector. If G1 has moved away from falling back to stop-the-world and now
fails instead, then it cannot recover in this scenario. Since G1 has constantly changed, this
is hard to pinpoint: descriptions from years ago are no longer accurate, and likely they
optimized against handling this case, preferring that the application be fixed to be better
behaved.

So far my fixes do seem to be chugging along and are past the failure point, but the job still
has more work to do before it's in the clear. I disabled caching (no obvious perf hit), discard
a PDDocument every 25 pages, and call GC each time a PDDocument is closed. I may look into using
Ghostscript and lambda functions instead, to distribute the work and offload it from the
application servers.
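
For readers who want to try the same workaround, here is a minimal sketch of the batching
loop described above. It is not the code I actually run: PageDocument and BatchedRenderer are
hypothetical names, and PageDocument is a stand-in for PDDocument so the example runs without
PDFBox on the classpath. In real code the opener would load the PDF and, per the issue
description, disable the cache via PDDocument#setResourceCache. The real PDFBox methods also
throw IOException, which this sketch leaves out.

```java
import java.util.function.Supplier;

/** Hypothetical stand-in for a document handle such as PDDocument. */
interface PageDocument extends AutoCloseable {
    void renderPage(int pageIndex);

    @Override
    void close(); // narrowed to throw no checked exception, to keep the sketch short
}

final class BatchedRenderer {
    private BatchedRenderer() {}

    /**
     * Renders pages [0, pageCount), opening a fresh document for every batch of
     * batchSize pages so that anything the previous instance retained becomes
     * unreachable. Returns how many times a document was opened.
     */
    static int renderAll(Supplier<PageDocument> opener, int pageCount, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int opens = 0;
        for (int start = 0; start < pageCount; start += batchSize) {
            int end = Math.min(start + batchSize, pageCount);
            try (PageDocument doc = opener.get()) {
                opens++;
                for (int page = start; page < end; page++) {
                    doc.renderPage(page);
                }
            }
            // Only a hint: the JVM is free to ignore an explicit GC request.
            System.gc();
        }
        return opens;
    }
}
```

With the 999-page document and a batch size of 25, this opens the document 40 times. The
trade-off is the load cost per batch against how much memory one instance can pin.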

With regard to JDK10, there are some build tools not yet JDK11 compatible that I am waiting
on. It takes some work to be JDK9 compatible, though 9=>10 was effortless. The 11 transition
is more work due to additional module removals. I have 11 prototyped, but it's stuck on an
infinite compilation bug, due to using Gradle 4.x (incompatible) and a plugin not yet released with
Gradle 5 support.

> Memory leak due to soft reference caching
> -----------------------------------------
>
>                 Key: PDFBOX-4396
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4396
>             Project: PDFBox
>          Issue Type: Bug
>    Affects Versions: 2.0.12
>         Environment: JDK10; G1
>            Reporter: Ben Manes
>            Priority: Major
>         Attachments: #2 - memory leak 2.png, #2 - memory leak.png, memory leak 2.png,
memory leak.png
>
>
> In a heap dump, it appears that DefaultResourceCache is retaining 5.3 GB of memory due
to buffered images (via PDImageXObject). I suspect that G1 is not collecting soft references
across all regions before it throws an out-of-memory error.
> In PDFBOX-4389, I discovered very slow PDDocument#load times due to a JDK10 I/O bug.
Previously I was loading the document to render each page, but this took 1.5 minutes. To work
around that bug I reused the document instance across pages. This seems to have failed because
the pages were cached and not cleared by the GC.
> The DefaultResourceCache does not prune its cache entries when the soft references are
collected. Like WeakHashMap, it should use a ReferenceQueue, poll it on every access, and
prune accordingly.
> Thankfully PDDocument#setResourceCache exists. For now I am going to reset the cache
to a new instance after a page has been rendered. The entries should no longer be reachable
and be GC'd more aggressively. If that doesn't work, I'll either replace the cache (e.g. with
Caffeine) or disable it by setting the instance to null.
> I think the desired fix is to prune the DefaultResourceCache and, ideally, reconsider
usage of soft references (as they tend to be poor in practice). 
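
The ReferenceQueue suggestion in the quoted description can be sketched roughly as follows.
This is an illustration of the general technique, not PDFBox's DefaultResourceCache and not a
proposed patch: PruningSoftCache and KeyedRef are hypothetical names, and the class is not
thread-safe. Each soft reference remembers its key, and every access first drains the queue
and removes the map entries whose values the collector has already cleared.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical soft-value cache that prunes cleared entries on every access. */
final class PruningSoftCache<K, V> {

    /** Soft reference that remembers its key so the stale map entry can be found. */
    private static final class KeyedRef<K, V> extends SoftReference<V> {
        final K key;

        KeyedRef(K key, V value, ReferenceQueue<V> queue) {
            super(value, queue);
            this.key = key;
        }
    }

    private final Map<K, KeyedRef<K, V>> map = new HashMap<>();
    private final ReferenceQueue<V> queue = new ReferenceQueue<>();

    V get(K key) {
        prune();
        KeyedRef<K, V> ref = map.get(key);
        return ref == null ? null : ref.get();
    }

    void put(K key, V value) {
        prune();
        map.put(key, new KeyedRef<>(key, value, queue));
    }

    int size() {
        prune();
        return map.size();
    }

    /** Drains the queue and drops entries whose values have been collected. */
    private void prune() {
        Reference<? extends V> cleared;
        while ((cleared = queue.poll()) != null) {
            @SuppressWarnings("unchecked")
            KeyedRef<K, V> ref = (KeyedRef<K, V>) cleared;
            // Two-argument remove: a newer value stored under the same key is left alone.
            map.remove(ref.key, ref);
        }
    }
}
```

Pruning keeps the map itself from growing without bound, but it does not change when the
collector decides to clear soft references, so it addresses only the first half of the desired
fix.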



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: dev-help@pdfbox.apache.org

