pdfbox-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Tilman Hausherr (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PDFBOX-4424) IOException when merging PDF documents containing URLs with unmatched brakets.
Date Tue, 08 Jan 2019 07:42:00 GMT

    [ https://issues.apache.org/jira/browse/PDFBOX-4424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16736836#comment-16736836
] 

Tilman Hausherr commented on PDFBOX-4424:
-----------------------------------------

Why do you consider this to be a bug, that a malformed PDF is not parsed? Btw I get the same
problem when just loading the malformed PDF.

> IOException when merging PDF documents containing URLs with unmatched brakets.
> ------------------------------------------------------------------------------
>
>                 Key: PDFBOX-4424
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4424
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>    Affects Versions: 2.0.9, 2.0.13
>            Reporter: Paul Slauenwhite
>            Priority: Major
>         Attachments: PdfBoxDefect.java, Test_PDF_Not_Working.pdf, Test_PDF_Working.pdf
>
>
> Steps to reproduce:
> 1. Download the attached files to a directory.
> 2. Refactor the constants in PdfBoxDefect.java to reference the downloaded files.
> 3. Run PdfBoxDefect.java.
> 4. Note, the error merging error:
> Merging /Users/paulslauenwhite/Downloads/Test_PDF_Working.pdf to /Users/paulslauenwhite/Downloads/Test_PDF_Working_MERGED.pdf.
> Merged /Users/paulslauenwhite/Downloads/Test_PDF_Working.pdf to /Users/paulslauenwhite/Downloads/Test_PDF_Working_MERGED.pdf.
> Merging /Users/paulslauenwhite/Downloads/Test_PDF_Not_Working.pdf to /Users/paulslauenwhite/Downloads/Test_PDF_Not_Working_MERGED.pdf.
> 15:59:32,120 [main]  WARN org.apache.pdfbox.pdfparser.BaseParser             
- Invalid dictionary, found: 'Æ' but expected: '/' at offset 13199
> 15:59:32,120 [main]  WARN org.apache.pdfbox.pdfparser.BaseParser             
- Bad dictionary declaration at offset 13224
> 15:59:32,120 [main]  WARN org.apache.pdfbox.pdfparser.BaseParser             
- Invalid dictionary, found: 'R' but expected: '/' at offset 13224
> 15:59:32,120 [main]  WARN org.apache.pdfbox.pdfparser.BaseParser             
- Bad dictionary declaration at offset 13314
> 15:59:32,120 [main]  WARN org.apache.pdfbox.pdfparser.BaseParser             
- Invalid dictionary, found: '_' but expected: '/' at offset 13314
> 15:59:32,120 [main]  WARN org.apache.pdfbox.pdfparser.BaseParser             
- Corrupt object reference at offset 13574
> 15:59:32,120 [main]  WARN org.apache.pdfbox.pdfparser.BaseParser             
- Corrupt object reference at offset 13588
> java.io.IOException: Unknown dir object c=')' cInt=41 peek=')' peekInt=41 at offset 13588
> at org.apache.pdfbox.pdfparser.BaseParser.parseDirObject(BaseParser.java:961)
> at org.apache.pdfbox.pdfparser.BaseParser.parseCOSArray(BaseParser.java:631)
> at org.apache.pdfbox.pdfparser.BaseParser.parseDirObject(BaseParser.java:874)
> at org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionaryValue(BaseParser.java:152)
> at org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionaryNameValuePair(BaseParser.java:279)
> at org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionary(BaseParser.java:212)
> at org.apache.pdfbox.pdfparser.BaseParser.parseDirObject(BaseParser.java:862)
> at org.apache.pdfbox.pdfparser.COSParser.parseFileObject(COSParser.java:852)
> at org.apache.pdfbox.pdfparser.COSParser.parseObjectDynamically(COSParser.java:821)
> at org.apache.pdfbox.pdfparser.COSParser.parseObjectDynamically(COSParser.java:741)
> at org.apache.pdfbox.pdfparser.COSParser.parseDictObjects(COSParser.java:701)
> at org.apache.pdfbox.pdfparser.PDFParser.initialParse(PDFParser.java:205)
> at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:240)
> at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1144)
> at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1060)
> at org.apache.pdfbox.multipdf.PDFMergerUtility.mergeDocuments(PDFMergerUtility.java:261)
> at org.apache.pdfbox.multipdf.PDFMergerUtility.mergeDocuments(PDFMergerUtility.java:231)
> at PdfBoxDefect.merge(PdfBoxDefect.java:73)
> at PdfBoxDefect.main(PdfBoxDefect.java:48)
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: dev-help@pdfbox.apache.org


Mime
View raw message