pdfbox-users mailing list archives

From Kevin Morin <mo...@codelutin.com>
Subject Re: Error on PDDocument.load
Date Fri, 20 Mar 2015 14:59:52 GMT
Hi,

Just bumping this thread ;)

Have a nice weekend.
BR

Kevin

On 02/03/2015 16:19, Kevin Morin wrote:
> Hi,
>
> Andreas, you said in the issue that you have a solution in mind, did you
> succeed in fixing it or not? It seems that my users have a lot of files
> of this kind...
>
> Thanks
> BR
>
> Kevin
>
> On 11/02/2015 23:16, Tilman Hausherr wrote:
>> I wasn't able to create a non confidential version of the file that
>> works with Adobe Reader. But here's an issue and a proposed patch.
>>
>> https://issues.apache.org/jira/browse/PDFBOX-2679
>>
>> Tilman
>>
>>> On 11.02.2015 at 18:54, Tilman Hausherr wrote:
>>> No, his file is confidential.
>>>
>>> However we might create a non confidential file that has the same error.
>>>
>>> Tilman
>>>
>>>> On 11.02.2015 at 18:40, John Hewson wrote:
>>>> Can we get a JIRA issue open for this, preferably with the file
>>>> attached?
>>>>
>>>> -- John
>>>>
>>>>> On 11 Feb 2015, at 00:29, Tilman Hausherr <THausherr@t-online.de>
>>>>> wrote:
>>>>>
>>>>> Yes, they made hacks. So did we, for many types of malformed files.
>>>>> Please send the file also to Andreas, unless you already did, he did
>>>>> many workarounds for malformed files.
>>>>>
>>>>> Tilman
>>>>>
>>>>>> On 11.02.2015 at 09:05, Kevin Morin wrote:
>>>>>> Ok. Why are other programs (like xpf) able to open it? I guess
>>>>>> they made a hack to fix this? Are you going to do something too?
>>>>>>
>>>>>> Thanks
>>>>>> BR
>>>>>>
>>>>>> Kevin
>>>>>>
>>>>>>> On 11/02/2015 08:53, Tilman Hausherr wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> I can reproduce the error. Your file is malformed. Please open it with
>>>>>>> NOTEPAD++ and go to the end:
>>>>>>>
>>>>>>> xref
>>>>>>> 1 7
>>>>>>> 0000000000 65535 f
>>>>>>> 0000000009 00000 n
>>>>>>> 0000358745 00000 n
>>>>>>> 0000358842 00000 n
>>>>>>> 0000359029 00000 n
>>>>>>> 0000359087 00000 n
>>>>>>> 0000359138 00000 n
>>>>>>> trailer
>>>>>>>
>>>>>>> The first number (1) means the number of the first object. So it would
>>>>>>> be 1. The second number (7) is the size of the table. The number 1 is
>>>>>>> incorrect, it should be 0, because "0000000000 65535 f" is the dummy
>>>>>>> object 0. Press CTRL-G and enter the offsets (e.g. 9, 45, 358745, ...)
>>>>>>> and you will see what I mean.
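
The manual CTRL-G check described above can also be done in code. The following is only a minimal sketch using the Java standard library, not PDFBox code: the class name `XrefOffsetCheck`, the method `pointsAtObject`, and the tiny stand-in file in `main` are all hypothetical. It tests whether the bytes at a given offset begin with the `<number> <generation> obj` header that a cross-reference entry claims to point at.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of PDFBox: checks where an xref offset really points.
final class XrefOffsetCheck {

    /** Returns true if the bytes at offset start with "<objNum> <gen> obj". */
    static boolean pointsAtObject(byte[] pdf, int offset, int objNum, int gen) {
        byte[] expected = (objNum + " " + gen + " obj").getBytes(StandardCharsets.ISO_8859_1);
        if (offset < 0 || offset + expected.length > pdf.length) {
            return false; // offset lies outside the file
        }
        for (int i = 0; i < expected.length; i++) {
            if (pdf[offset + i] != expected[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Stand-in for a real file (assumption): object 1 starts at byte offset 9.
        byte[] pdf = "%PDF-1.4\n1 0 obj\n<< >>\nendobj\n".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(pointsAtObject(pdf, 9, 1, 0)); // true: offset 9 holds "1 0 obj"
        System.out.println(pointsAtObject(pdf, 9, 2, 0)); // false: the table is off by one
    }
}
```

With a table that wrongly starts at 1, every offset is attributed to the next object number, which would be consistent with the "Can't find the object 7 0 (origin offset 359138)" message reported in this thread.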
>>>>>>>
>>>>>>> From the PDF spec:
>>>>>>>
>>>>>>> The free entries in the cross-reference table form a linked list, with
>>>>>>> each free entry containing the object number of the next. The first
>>>>>>> entry in the table (object number 0) is always free and has a generation
>>>>>>> number of 65,535; it is the head of the linked list of free objects.
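
The rule quoted from the spec suggests a simple heuristic for spotting this kind of malformed table. The sketch below is an illustration only and is not the patch proposed in PDFBOX-2679; the class `XrefSubsectionCheck` and the method `suggestedStart` are hypothetical names. It assumes that a subsection whose first entry is free with generation 65535 is really describing object 0, whatever start number the header declares.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of PDFBox: guesses the intended first object number.
final class XrefSubsectionCheck {

    /**
     * lines.get(0) is the "start count" header, lines.get(1) the first entry.
     * Returns 0 if the first entry looks like the head of the free list,
     * otherwise the start number declared in the header.
     */
    static int suggestedStart(List<String> lines) {
        String[] header = lines.get(0).trim().split("\\s+");
        int declaredStart = Integer.parseInt(header[0]);
        String[] first = lines.get(1).trim().split("\\s+");
        // Heuristic (assumption): "<offset> 65535 f" as first entry means object 0.
        boolean headOfFreeList = first.length == 3
                && first[2].equals("f")
                && Integer.parseInt(first[1]) == 65535;
        return headOfFreeList ? 0 : declaredStart;
    }

    public static void main(String[] args) {
        // The table from the message above.
        List<String> table = Arrays.asList(
                "1 7",
                "0000000000 65535 f",
                "0000000009 00000 n",
                "0000358745 00000 n",
                "0000358842 00000 n",
                "0000359029 00000 n",
                "0000359087 00000 n",
                "0000359138 00000 n");
        System.out.println(suggestedStart(table)); // prints 0 although the header says 1
    }
}
```

This is deliberately a heuristic: a free entry with generation 65535 is not guaranteed to be object 0 in every file, so a real parser would want to confirm the guess by checking the offsets against the object headers in the file.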
>>>>>>>
>>>>>>> Tilman
>>>>>>>
>>>>>>>
>>>>>>>> On 11.02.2015 at 08:21, Kevin Morin wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I am sorry, it seems that I did not send you the right file...
>>>>>>>> Actually, I was also testing the wrong file on linux from the
>>>>>>>> beginning. The file is also displaying blank on linux, on java 7 or 8...
>>>>>>>> Here is the right file.
>>>>>>>>
>>>>>>>> I am sorry to make you work for nothing...
>>>>>>>>
>>>>>>>> BR
>>>>>>>>
>>>>>>>> Kevin
>>>>>>>>
>>>>>>>>
>>>>>>>>> On 10/02/2015 21:32, Tilman Hausherr wrote:
>>>>>>>>> So we e-mailed and the result is
>>>>>>>>> - you're really working on W2008 with the file that you sent me
>>>>>>>>> - you get the same error on W2008 with the app (and I don't)
>>>>>>>>>
>>>>>>>>> I have analysed that file and did some debug traces. If loading that on
>>>>>>>>> W2008 is a no-no, you'd have to build from source and I'll tell you the
>>>>>>>>> changes.
>>>>>>>>>
>>>>>>>>> http://home.snafu.de/tilman/tmp/pdfbox-app-2.0.0-TILMAN.jar
>>>>>>>>>
>>>>>>>>> Don't use that version for production. It contains lots of stuff for my
>>>>>>>>> own tests. Only use it for this problem. Here's the output that you
>>>>>>>>> should get:
>>>>>>>>>
>>>>>>>>> Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.COSParser parseXrefStream
>>>>>>>>> INFORMATION: parseXrefStream: objByteOffset = 116
>>>>>>>>> Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.PDFXrefStreamParser parse
>>>>>>>>> INFORMATION: PDFXrefStreamParser: 7 0 obj at offset: 16
>>>>>>>>> Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.PDFXrefStreamParser parse
>>>>>>>>> INFORMATION: PDFXrefStreamParser: 8 0 obj at offset: 573
>>>>>>>>> Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.PDFXrefStreamParser parse
>>>>>>>>> INFORMATION: PDFXrefStreamParser: 9 0 obj at offset: 633
>>>>>>>>> Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.PDFXrefStreamParser parse
>>>>>>>>> INFORMATION: PDFXrefStreamParser: 10 0 obj at offset: 817
>>>>>>>>> Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.PDFXrefStreamParser parse
>>>>>>>>> INFORMATION: PDFXrefStreamParser: 11 0 obj at offset: 914
>>>>>>>>> Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.PDFXrefStreamParser parse
>>>>>>>>> INFORMATION: PDFXrefStreamParser: 12 0 obj at offset: 116
>>>>>>>>> Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.PDFXrefStreamParser parse
>>>>>>>>> INFORMATION: PDFXrefStreamParser: 13 0 obj at offset: 436
>>>>>>>>> Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.COSParser parseXrefStream
>>>>>>>>> INFORMATION: parseXrefStream: objByteOffset = 363505
>>>>>>>>> Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.PDFXrefStreamParser parse
>>>>>>>>> INFORMATION: PDFXrefStreamParser: 1 0 obj at offset: 359638
>>>>>>>>> Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.PDFXrefStreamParser parse
>>>>>>>>> INFORMATION: PDFXrefStreamParser: 2 0 obj at offset: 363167
>>>>>>>>> Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.PDFXrefStreamParser parse
>>>>>>>>> INFORMATION: PDFXrefStreamParser: 3 0 obj at offset: 363307
>>>>>>>>> Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.PDFXrefStreamParser parse
>>>>>>>>> INFORMATION: PDFXrefStreamParser: 4 0 obj at offset: 363505
>>>>>>>>> Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.PDFXrefStreamParser parse
>>>>>>>>> INFORMATION: PDFXrefStreamParser: 5 stmnr: 2
>>>>>>>>> Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.PDFXrefStreamParser parse
>>>>>>>>> INFORMATION: PDFXrefStreamParser: 6 stmnr: 3
>>>>>>>>>
>>>>>>>>> What I wonder is if the offsets will be the same.
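
To compare the offsets from two machines without reading the traces by eye, the "obj at offset" lines can be extracted and compared. This is a rough sketch using only `java.util.regex`; the class `XrefLogOffsets` and the second log in `main` are made up for illustration, and the pattern only matches the wording of the debug lines shown above, which come from a private test build and may not appear in a released PDFBox.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of PDFBox: collects "<num> <gen> obj at offset: <n>" lines.
final class XrefLogOffsets {

    private static final Pattern ENTRY =
            Pattern.compile("PDFXrefStreamParser: (\\d+) (\\d+) obj at offset:\\s*(\\d+)");

    /** Maps "<objNum> <gen>" to the logged byte offset, in log order. */
    static Map<String, Long> offsets(String log) {
        Map<String, Long> result = new LinkedHashMap<>();
        Matcher m = ENTRY.matcher(log);
        while (m.find()) {
            result.put(m.group(1) + " " + m.group(2), Long.parseLong(m.group(3)));
        }
        return result;
    }

    public static void main(String[] args) {
        String mine = "INFORMATION: PDFXrefStreamParser: 7 0 obj at offset: 16\n"
                + "INFORMATION: PDFXrefStreamParser: 8 0 obj at offset: 573\n";
        // Invented second trace (assumption), differing in one offset.
        String yours = "INFORMATION: PDFXrefStreamParser: 7 0 obj at offset: 16\n"
                + "INFORMATION: PDFXrefStreamParser: 8 0 obj at offset: 575\n";
        Map<String, Long> a = offsets(mine);
        Map<String, Long> b = offsets(yours);
        for (Map.Entry<String, Long> e : a.entrySet()) {
            if (!e.getValue().equals(b.get(e.getKey()))) {
                System.out.println("differs: " + e.getKey() + " obj");
            }
        }
    }
}
```

Lines such as "5 stmnr: 2" are skipped because they carry a stream number rather than a byte offset.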
>>>>>>>>>
>>>>>>>>> Tilman
>>>>>>>>>
>>>>>>>>> PS: Sorry I usually can't help during EU business hours. Day job :-)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> On 09.02.2015 at 11:26, Kevin Morin wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I will probably have to migrate to java 8 because of a bug in java 7
>>>>>>>>>> which throws an error when rendering a certain type of PDF (cf thread
>>>>>>>>>> Error on PDFRenderer.renderImage (PDFBox 2.0)). Could someone please
>>>>>>>>>> check why it is not working on Windows Server 2008 R2 Standard? If you
>>>>>>>>>> do not have this OS, tell me what I can do to help you.
>>>>>>>>>>
>>>>>>>>>> Thanks
>>>>>>>>>> BR
>>>>>>>>>>
>>>>>>>>>> Kevin
>>>>>>>>>>
>>>>>>>>>>> On 21/01/2015 12:26, Andreas Lehmkühler wrote:
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>>> Kevin Morin <morin@codelutin.com> wrote on 21 January 2015 at 12:14:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> I thought I was running java 7 but it's java 8... I tried with java 7
>>>>>>>>>>>> and it works. I do not need it to work with java 8, java 7 is ok for
>>>>>>>>>>>> me.
>>>>>>>>>>> It works for me using java 8 on win7 and linux as well. I guess, the
>>>>>>>>>>> issue has to be something else....
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> BR
>>>>>>>>>>> Andreas Lehmkühler
>>>>>>>>>>>
>>>>>>>>>>>> Thanks for your help and for all your work.
>>>>>>>>>>>>
>>>>>>>>>>>> Kevin
>>>>>>>>>>>>
>>>>>>>>>>>>> On 21/01/2015 11:54, Maruan Sahyoun wrote:
>>>>>>>>>>>>> Hi Kevin
>>>>>>>>>>>>>
>>>>>>>>>>>>> works for me - what's your Java Version?
>>>>>>>>>>>>>
>>>>>>>>>>>>> BR
>>>>>>>>>>>>> Maruan
>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 21.01.2015 at 11:24, Kevin Morin <morin@codelutin.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> it does not work with PDFToImage either, I still get a blank image. Plus, I
>>>>>>>>>>>>>> did not set the nonSeq option, yet it seems to be using the non
>>>>>>>>>>>>>> sequential parser. And I have the following traces:
>>>>>>>>>>>>>> janv. 21, 2015 11:20:02 AM org.apache.pdfbox.pdfparser.NonSequentialPDFParser checkXrefOffsets
>>>>>>>>>>>>>> GRAVE: Can't find the object 7 0 (origin offset 359138)
>>>>>>>>>>>>>> janv. 21, 2015 11:20:03 AM org.apache.pdfbox.contentstream.PDFStreamEngine operatorException
>>>>>>>>>>>>>> GRAVE: Missing XObject: Im1
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> BR
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Kevin
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 21/01/2015 11:11, Maruan Sahyoun wrote:
>>>>>>>>>>>>>>> Hi Kevin,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> you can test with the PDFToImage command [1] available from the
>>>>>>>>>>>>>>> pdfbox-app [2] to see if the issue happens there. The source for
>>>>>>>>>>>>>>> PDFToImage is available in the tools section of the SVN repo or
>>>>>>>>>>>>>>> viewable online [3].
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> BR
>>>>>>>>>>>>>>> Maruan
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> [1] https://pdfbox.apache.org/1.8/commandline.html#pdfToImage
>>>>>>>>>>>>>>> [2] https://repository.apache.org/content/groups/snapshots/org/apache/pdfbox/pdfbox-app/2.0.0-SNAPSHOT/
>>>>>>>>>>>>>>> [3] http://svn.apache.org/viewvc/pdfbox/trunk/tools/src/main/java/org/apache/pdfbox/tools/PDFToImage.java?view=markup
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 21.01.2015 at 11:00, Kevin Morin <morin@codelutin.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi Andreas,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I am using the latest snapshot available on the maven repository. And I
>>>>>>>>>>>>>>>> am running my app on Windows Server 2008 R2 Standard and it does not work
>>>>>>>>>>>>>>>> (white page). Could you send me the code or a jar to test on this server,
>>>>>>>>>>>>>>>> to check whether the problem comes from my code?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> BR
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Kevin
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 19/01/2015 19:13, Andreas Lehmkuehler wrote:
>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On 19.01.2015 at 12:45, Kevin Morin wrote:
>>>>>>>>>>>>>>>>>> Actually, the issue is not only these traces. The real issue is that I
>>>>>>>>>>>>>>>>>> have a blank image when I try to render the document.
>>>>>>>>>>>>>>>>> I've checked your PDF and everything renders fine. I've tried
>>>>>>>>>>>>>>>>> SNAPSHOT-891 on linux (running java 1.8, 1.7 and 1.6) and the latest
>>>>>>>>>>>>>>>>> SNAPSHOT-947 on win7 running java 1.7
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Maybe your SNAPSHOT is outdated?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> BR
>>>>>>>>>>>>>>>>> Andreas Lehmkühler
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On 19/01/2015 12:39, Kevin Morin wrote:
>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I am using the 2.0 snapshot version to render images of PDFs, but on some
>>>>>>>>>>>>>>>>>>> documents, I have the following error when I call PDDocument.load(file):
>>>>>>>>>>>>>>>>>>> 2015/01/19 12:32:48 ERROR (org.apache.pdfbox.pdfparser.NonSequentialPDFParser:1864) - Can't find the object 7 0 (origin offset 359138)
>>>>>>>>>>>>>>>>>>> 2015/01/19 12:32:48 ERROR (org.apache.pdfbox.contentstream.PDFStreamEngine:840) - Missing XObject: Im1
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I first had it a few days ago (I did not report it, shame on me) but the
>>>>>>>>>>>>>>>>>>> error did not occur when I called the loadLegacy method on PDDocument.
>>>>>>>>>>>>>>>>>>> But the loadLegacy method is not available anymore...
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> The issue happens on Windows (works fine on Debian).
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks for your help
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Kevin
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
> For additional commands, e-mail: users-help@pdfbox.apache.org
>



