pdfbox-dev mailing list archives

From Krzysztof Podsiadło (JIRA) <j...@apache.org>
Subject [jira] [Created] (PDFBOX-4442) Loading files larger than available memory
Date Wed, 23 Jan 2019 16:09:00 GMT
Krzysztof Podsiadło created PDFBOX-4442:
-------------------------------------------

             Summary: Loading files larger than available memory
                 Key: PDFBOX-4442
                 URL: https://issues.apache.org/jira/browse/PDFBOX-4442
             Project: PDFBox
          Issue Type: Bug
    Affects Versions: 2.0.13
            Reporter: Krzysztof Podsiadło


I am trying to load a huge (8 GB) PDF and the load fails with an OutOfMemoryError.
Is it even possible to load a file larger than the available memory?

Sample program:
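{code:java}
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import org.apache.pdfbox.io.MemoryUsageSetting;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.encryption.InvalidPasswordException;

public class PdfLoader {
    public static void main(String[] args) {
        File file = new File("pdf_8Gb.pdf");
        try (InputStream inputStream = new FileInputStream(file)) {
            // Buffer to temp files only, no main-memory cache (PdfLoader.java:13 in the stack trace)
            try (PDDocument document = PDDocument.load(inputStream, MemoryUsageSetting.setupTempFileOnly())) {
                System.out.println("Success");
            } catch (InvalidPasswordException e) {
                e.printStackTrace();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
{code}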
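
To put a number on "larger than available memory", the file size can be compared with the maximum heap the JVM is allowed to use. The following is only a minimal sketch using the standard library; `HeapCheck` and `exceedsHeap` are hypothetical names and not part of PDFBox. It does not predict whether `PDDocument.load` succeeds, since the stack trace shows the failure while parsing COS dictionaries, not while reading raw bytes.

```java
import java.io.File;

// Hypothetical helper (not a PDFBox class): compares a file's size with the JVM heap limit.
class HeapCheck {

    /** Returns true when a file of fileBytes is larger than a heap capped at maxHeapBytes. */
    static boolean exceedsHeap(long fileBytes, long maxHeapBytes) {
        return fileBytes > maxHeapBytes;
    }

    public static void main(String[] args) {
        // File name taken from the sample program; pass another path as the first argument.
        File file = new File(args.length > 0 ? args[0] : "pdf_8Gb.pdf");
        long fileBytes = file.length();                   // 0 if the file does not exist
        long maxHeap = Runtime.getRuntime().maxMemory();  // upper bound, e.g. as set by -Xmx
        System.out.printf("file: %d MiB, max heap: %d MiB, larger than heap: %b%n",
                fileBytes >> 20, maxHeap >> 20, exceedsHeap(fileBytes, maxHeap));
    }
}
```

Running it with the same JVM options as the sample program shows the heap limit that was in effect when the error occurred.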
 

Exception stacktrace:


{code:java}
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at java.base/java.util.LinkedHashMap.newNode(LinkedHashMap.java:256)
    at java.base/java.util.HashMap.putVal(HashMap.java:626)
    at java.base/java.util.HashMap.put(HashMap.java:607)
    at org.apache.pdfbox.cos.COSDictionary.setItem(COSDictionary.java:217)
    at org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionaryNameValuePair(BaseParser.java:304)
    at org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionary(BaseParser.java:212)
    at org.apache.pdfbox.pdfparser.BaseParser.parseDirObject(BaseParser.java:864)
    at org.apache.pdfbox.pdfparser.COSParser.parseFileObject(COSParser.java:904)
    at org.apache.pdfbox.pdfparser.COSParser.parseObjectDynamically(COSParser.java:873)
    at org.apache.pdfbox.pdfparser.COSParser.parseObjectDynamically(COSParser.java:793)
    at org.apache.pdfbox.pdfparser.COSParser.parseDictObjects(COSParser.java:753)
    at org.apache.pdfbox.pdfparser.PDFParser.initialParse(PDFParser.java:187)
    at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:226)
    at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1200)
    at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1116)
    at pdfbox.test.PdfLoader.main(PdfLoader.java:13){code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: dev-help@pdfbox.apache.org

