pdfbox-dev mailing list archives

From "Emmeran Seehuber (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PDFBOX-4242) Fontbox does not close file descriptor when loading fonts.
Date Mon, 18 Jun 2018 15:38:00 GMT

    [ https://issues.apache.org/jira/browse/PDFBOX-4242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16515894#comment-16515894
] 

Emmeran Seehuber commented on PDFBOX-4242:
------------------------------------------

[~tilman] subset() will only be called when fonts are used in the document. If for whatever
reason you are loading but not using a font, you will leak file handles, which can (and will)
bring your (web) server down once the file handle limit is exhausted.

This can happen if you load all possibly needed fonts upfront, but whether they are used depends
on the data you put in the PDF (e.g. a Chinese font is only used when there really are Chinese
characters). I had this in production with OpenHTMLToPDF, see also [https://github.com/danfickle/openhtmltopdf/pull/215].
The workaround was to subset() all loaded fonts manually. As we had a handle on the TrueTypeFont,
I tried to close() it directly. But this causes an NPE, as RAFDataStream.close() violates the
close() contract, namely that calling close() twice should have no effect. Instead, when called
a second time, RAFDataStream.close() just throws an NPE.

It would be nice if RAFDataStream.close() could be fixed (i.e. by putting an if (raf != null)
check before the raf.close()).
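
To illustrate the kind of guard I mean, here is a minimal sketch. GuardedRafStream is a hypothetical stand-in written for this comment, not the actual RAFDataStream source; it only assumes the stream wraps a java.io.RandomAccessFile field named raf, as the name suggests.

```java
import java.io.Closeable;
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical stand-in for RAFDataStream (not the real FontBox class).
// It shows a close() that is safe to call more than once.
class GuardedRafStream implements Closeable {
    private RandomAccessFile raf;

    GuardedRafStream(File file, String mode) throws IOException {
        this.raf = new RandomAccessFile(file, mode);
    }

    // Helper for illustration only: true once the handle has been released.
    boolean isClosed() {
        return raf == null;
    }

    @Override
    public void close() throws IOException {
        // Guard against repeated calls: the second close() is a no-op.
        if (raf != null) {
            try {
                raf.close();
            } finally {
                // Drop the reference even if the underlying close() fails.
                raf = null;
            }
        }
    }
}
```

With a guard like this, code that holds the font can close it explicitly, and a later close() from elsewhere does no harm.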

> Fontbox does not close file descriptor when loading fonts.
> ----------------------------------------------------------
>
>                 Key: PDFBOX-4242
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4242
>             Project: PDFBox
>          Issue Type: Bug
>          Components: FontBox
>    Affects Versions: 2.0.9
>            Reporter: Glen Peterson
>            Priority: Minor
>              Labels: file_leak
>
> My app has been getting "java.io.FileNotFoundException (No file descriptors available)"
> and I've confirmed that it's because fontbox isn't closing its file descriptors.
> In org.apache.fontbox.ttf.TTFParser there's this method:
> {{public TrueTypeFont parse(File ttfFile) throws IOException {}}
>  {{  RAFDataStream raf = new RAFDataStream(ttfFile, "r");}}
> {{  try {}}
>  {{    return this.parse((TTFDataStream)raf);}}
>  {{  } catch (IOException var4) {}}
>  {{    // close only on error (file is still being accessed later)}}
>  {{    raf.close();}}
>  {{    throw var4;}}
>  {{}}}
>  {{}}}
> I would have expected to see the close() in a finally block so that the file is always
> closed, not just on exceptions. Presumably, you can keep it in memory without leaving the
> file descriptor open?
> {{public TrueTypeFont parse(File ttfFile) throws IOException {}}
>  {{  RAFDataStream raf = new RAFDataStream(ttfFile, "r");}}
> {{  try {}}
>  {{    return this.parse((TTFDataStream)raf);}}
>  {{  } catch (IOException var4) {}}
>  {{    raf.close();}}
>  {{    throw var4;}}
>  {{  } finally {}}
>  {{    raf.close();}}
>  {{}}}
>  {{}}}
> I tried performing this in a lazy initialization, but it blew up:
> java.lang.RuntimeException: java.io.IOException: The TrueType font null does not contain a 'cmap' table
> Caused by: java.io.IOException: The TrueType font null does not contain a 'cmap' table
>   at org.apache.fontbox.ttf.TrueTypeFont.getUnicodeCmapImpl(TrueTypeFont.java:548)
>   at org.apache.fontbox.ttf.TrueTypeFont.getUnicodeCmapLookup(TrueTypeFont.java:528)
>   at org.apache.fontbox.ttf.TrueTypeFont.getUnicodeCmapLookup(TrueTypeFont.java:514)
>   at org.apache.fontbox.ttf.TTFSubsetter.<init>(TTFSubsetter.java:91)
>   at org.apache.pdfbox.pdmodel.font.TrueTypeEmbedder.subset(TrueTypeEmbedder.java:321)
>   at org.apache.pdfbox.pdmodel.font.PDType0Font.subset(PDType0Font.java:239)
>   at org.apache.pdfbox.pdmodel.PDDocument.save(PDDocument.java:1271)
> Thoughts?
> Thanks for PDFBox - it's been really helpful!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: dev-help@pdfbox.apache.org

