pdfbox-users mailing list archives

From Tilman Hausherr <THaush...@t-online.de>
Subject Re: How do I analyze a problem PDF?
Date Wed, 01 Mar 2017 08:29:57 GMT
On 28.02.2017 at 23:51, Thad Humphries wrote:
> No, the document has not been closed prematurely.

And what is this, then? The source document is closed before the destination is saved:

inDoc.close();

....

document.save();
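
As a rough illustration of why that order can matter, here is a toy model in plain Java. ToyStream and ToyDocument are hypothetical stand-ins, not PDFBox classes, and the assumption that the destination ends up referencing streams still owned by the source is only a guess at what happens for this particular PDF. It is a sketch of the lifecycle problem, not of PDFBox internals.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Toy model only: every type here is hypothetical and unrelated to the PDFBox API. */
final class SharedStreamSketch {

    /** Stands in for a stream whose data becomes unreadable once it is closed. */
    static final class ToyStream {
        private final byte[] data;
        private boolean closed;

        ToyStream(byte[] data) {
            this.data = data;
        }

        byte[] read() throws IOException {
            if (closed) {
                throw new IOException("stream has been closed and cannot be read");
            }
            return data;
        }
    }

    /** Stands in for a document that closes every stream it owns on close(). */
    static final class ToyDocument implements AutoCloseable {
        private final List<ToyStream> owned = new ArrayList<>();
        private final List<ToyStream> referenced = new ArrayList<>();

        ToyDocument add(byte[] data) {
            ToyStream s = new ToyStream(data);
            owned.add(s);
            referenced.add(s);
            return this;
        }

        /** Assumption: appending shares references, so the source keeps ownership. */
        void append(ToyDocument source) {
            referenced.addAll(source.referenced);
        }

        /** Reads every referenced stream, including ones owned by other documents. */
        byte[] save() throws IOException {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            for (ToyStream s : referenced) {
                out.write(s.read());
            }
            return out.toByteArray();
        }

        @Override
        public void close() {
            for (ToyStream s : owned) {
                s.closed = true;
            }
        }
    }

    /** Failing order: close the source first, then try to save the destination. */
    static boolean closeThenSaveFails() {
        ToyDocument dest = new ToyDocument();
        ToyDocument src = new ToyDocument().add(new byte[] {1, 2, 3});
        dest.append(src);
        src.close();
        try {
            dest.save();
            return false;
        } catch (IOException e) {
            return true;
        }
    }

    /** Working order: save the destination, then close the source. Returns -1 on error. */
    static int saveThenCloseLength() {
        ToyDocument dest = new ToyDocument();
        ToyDocument src = new ToyDocument().add(new byte[] {1, 2, 3});
        dest.append(src);
        try {
            byte[] saved = dest.save();
            return saved.length;
        } catch (IOException e) {
            return -1;
        } finally {
            src.close();
            dest.close();
        }
    }
}
```

Applied to the test quoted below, this would mean moving both inDoc.close() calls to after document.save(...).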



Tilman




>   It's being processed
> through the same calls that I use for all my other documents that must
> merge a PDF, either ones I create or ones from a repository. In this code
>
>      ...
>      File outpath = new File(OUT_DIR, "mergedTwoPdfs.pdf");
>      document.save(new FileOutputStream(outpath.toString()));
>      ...
>
> when I trace the execution in Eclipse, document's close member is false
> immediately before calling document.save().
>
> Something else interesting: When I use PDFMerger to merge the good, 40K PDF
> with another document, the output PDF is only 2K larger than the two input
> files combined. But when I merge the original 39K PDF, the output is almost
> 40K larger than the two files combined.
>
> The code below will cause the error. The two portions in curly brackets are
> (essentially) the merge method in my PrintToPdf class. The stack trace is
> the same. The odd PDF is the second one, "moroccan_chicken.pdf".
>
> I can see about a place to post it tomorrow, through a short-term anonymous
> FTP at my office. In the meanwhile I'll see if anything from PDFDebugger
> makes sense to me. :)
>
>
>    @Test
>    public void testMerge2PdfDocs() throws Exception {
>      File file0 = new File(this.getClass().getResource("/Bacon and Brussels Sprout Hash.pdf").toURI());
>      byte [] buf0 = IOUtils.toByteArray(new FileInputStream(file0));
>      File file1 = new File(this.getClass().getResource("/moroccan_chicken.pdf").toURI());
>      byte [] buf1 = IOUtils.toByteArray(new FileInputStream(file1));
>      PDDocument document = new PDDocument();
>      {
>        PDFMergerUtility merger = new PDFMergerUtility();
>        PDDocument inDoc = PDDocument.load(buf0);
>        merger.appendDocument(document, inDoc);
>        inDoc.close();
>      }
>      {
>        PDFMergerUtility merger = new PDFMergerUtility();
>        PDDocument inDoc = PDDocument.load(buf1);
>        merger.appendDocument(document, inDoc);
>        inDoc.close();
>      }
>      File outpath = new File(OUT_DIR, "mergedTwoPdfs.pdf");
>      document.save(new FileOutputStream(outpath.toString()));
>      document.close();
>      assert true;
>    }
>
> On Tue, Feb 28, 2017 at 5:30 PM, Tilman Hausherr <THausherr@t-online.de>
> wrote:
>
>> The best would be to upload the PDF somewhere, and also post your code.
>>
>> I sometimes analyse PDFs with Notepad++, sometimes with PDFDebugger, and
>> often with both. But these tools only help if you know what to expect.
>>
>> The text below looks like a COSStream was closed prematurely (did you
>> close the source documents too early?). I'd rather suspect a bug in your
>> code or in our code.
>>
>> Tilman
>>
>>
>>
>>
>> On 28.02.2017 at 23:16, Thad Humphries wrote:
>>
>>> I have a PDF 1.4 document that opens in different PDF viewers without
>>> warnings, yet there seems to be something odd about it. How might I
>>> analyze it?
>>>
>>> If I merge this PDF from the command line with pdfbox-app-2.0.4.jar's
>>> PDFMerger, the output is fine. However, whenever I merge it in my own
>>> code, where it is first read into a byte array, loaded into a document,
>>> and then passed to PDFMergerUtility's appendDocument(destination, source),
>>> the destination PDDocument cannot be saved to disk. This is the only PDF
>>> of several dozen I've tested that has this problem. I see nothing odd
>>> when I trace the program in a debugger. It fails only at PDDocument
>>> save() (stack trace below).
>>>
>>> If I open the original PDF (39K) in MacOSX Yosemite's Preview and save it,
>>> the saved PDF is now 40K, and it merges just fine in my code. (I believe
>>> this PDF was created about 5 years ago, most likely with the export-to-PDF
>>> action using OpenOffice on Linux.)
>>>
>>> I expect that I'll see odd PDFs like this from time to time. (Lord knows
>>> that I've amassed quite a collection of buggy TIFF images over the years.)
>>> With TIFFs I can find out a lot using libtiff's tiffinfo and tiffdump
>>> utilities and a hex editor. Are there any routines in PDFBox that might
>>> help me with PDF files? Are there any other tools, open source or
>>> commercial?
>>>
>>>
>>> Stack trace from JUnit:
>>>
>>> java.io.IOException: COSStream has been closed and cannot be read. Perhaps its enclosing PDDocument has been closed?
>>> at org.apache.pdfbox.cos.COSStream.checkClosed(COSStream.java:77)
>>> at org.apache.pdfbox.cos.COSStream.createRawInputStream(COSStream.java:125)
>>> at org.apache.pdfbox.pdfwriter.COSWriter.visitFromStream(COSWriter.java:1200)
>>> at org.apache.pdfbox.cos.COSStream.accept(COSStream.java:383)
>>> at org.apache.pdfbox.cos.COSObject.accept(COSObject.java:158)
>>> at org.apache.pdfbox.pdfwriter.COSWriter.doWriteObject(COSWriter.java:522)
>>> at org.apache.pdfbox.pdfwriter.COSWriter.doWriteObjects(COSWriter.java:460)
>>> at org.apache.pdfbox.pdfwriter.COSWriter.doWriteBody(COSWriter.java:444)
>>> at org.apache.pdfbox.pdfwriter.COSWriter.visitFromDocument(COSWriter.java:1096)
>>> at org.apache.pdfbox.cos.COSDocument.accept(COSDocument.java:419)
>>> at org.apache.pdfbox.pdfwriter.COSWriter.write(COSWriter.java:1367)
>>> at org.apache.pdfbox.pdfwriter.COSWriter.write(COSWriter.java:1254)
>>> at org.apache.pdfbox.pdmodel.PDDocument.save(PDDocument.java:1232)
>>> at com.jthad.util.image.TestPrintToPdf.testMergeTwoPdfDocs(TestPrintToPdf.java:192)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>> at java.lang.reflect.Method.invoke(Method.java:597)
>>> at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>>> at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>>> at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>>> at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>>> at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>>> at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>>> at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
>>> at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>>> at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
>>> at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
>>> at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>>> at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>>> at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
>>> at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
>>> at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
>>> at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
>>> at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>>> at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:86)
>>> at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
>>> at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:459)
>>> at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:678)
>>> at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:382)
>>> at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:192)
>>>
>>>
>>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
>> For additional commands, e-mail: users-help@pdfbox.apache.org
>>
>>
>

