pdfbox-users mailing list archives

From Tilman Hausherr <THaush...@t-online.de>
Subject Re: Error on PDDocument.load
Date Wed, 11 Feb 2015 17:54:43 GMT
No, his file is confidential.

However, we might create a non-confidential file that has the same error.

Tilman

On 11.02.2015 at 18:40, John Hewson wrote:
> Can we get a JIRA issue open for this, preferably with the file attached?
>
> -- John
>
>> On 11 Feb 2015, at 00:29, Tilman Hausherr <THausherr@t-online.de> wrote:
>>
>> Yes, they made hacks. So did we, for many types of malformed files. Please
>> also send the file to Andreas, unless you already did; he has written many
>> workarounds for malformed files.
>>
>> Tilman
>>
>>> On 11.02.2015 at 09:05, Kevin Morin wrote:
>>> OK. Why are other programs (like xpf) able to open it? I guess they added
>>> a hack to work around this? Are you going to do something too?
>>>
>>> Thanks
>>> BR
>>>
>>> Kevin
>>>
>>>> On 11/02/2015 08:53, Tilman Hausherr wrote:
>>>> Hi,
>>>>
>>>> I can reproduce the error. Your file is malformed. Please open it with
>>>> NOTEPAD++ and go to the end:
>>>>
>>>> xref
>>>> 1 7
>>>> 0000000000 65535 f
>>>> 0000000009 00000 n
>>>> 0000358745 00000 n
>>>> 0000358842 00000 n
>>>> 0000359029 00000 n
>>>> 0000359087 00000 n
>>>> 0000359138 00000 n
>>>> trailer
>>>>
>>>> The first number (1) is the object number of the first entry in this
>>>> section, so the entries would be numbered starting at 1. The second
>>>> number (7) is the number of entries in the table. The number 1 is
>>>> incorrect; it should be 0, because "0000000000 65535 f" is the dummy
>>>> object 0. Press CTRL-G and enter the offsets (e.g. 9, 45, 358745, ...)
>>>> and you will see what I mean.
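
The manual check described above (jump to each offset and see which object starts there) can also be scripted. The following is a minimal sketch, not PDFBox code: the class XrefOffsetProbe and its methods are hypothetical names, generation 0 is assumed for every in-use entry, and the tiny PDF in main is synthetic. It counts how many xref offsets really point at "<n> 0 obj" for a given first object number.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of PDFBox.
final class XrefOffsetProbe {

    /** Returns true if the bytes at the given offset start with "<objNum> <gen> obj". */
    static boolean objectStartsAt(byte[] pdf, long offset, int objNum, int gen) {
        byte[] expected = (objNum + " " + gen + " obj").getBytes(StandardCharsets.ISO_8859_1);
        if (offset < 0 || offset + expected.length > pdf.length) {
            return false;
        }
        for (int i = 0; i < expected.length; i++) {
            if (pdf[(int) offset + i] != expected[i]) {
                return false;
            }
        }
        return true;
    }

    /**
     * Counts the in-use entries whose offset points at the expected object header
     * when the table is numbered starting at firstObj (generation 0 assumed).
     */
    static int countMatches(byte[] pdf, long[] offsets, boolean[] inUse, int firstObj) {
        int count = 0;
        for (int i = 0; i < offsets.length; i++) {
            if (inUse[i] && objectStartsAt(pdf, offsets[i], firstObj + i, 0)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Synthetic two-object file; entry 0 of the table is the free dummy entry.
        String text = "%PDF-1.4\n1 0 obj\n<<>>\nendobj\n2 0 obj\n<<>>\nendobj\n";
        byte[] pdf = text.getBytes(StandardCharsets.ISO_8859_1);
        long[] offsets = {0, text.indexOf("1 0 obj"), text.indexOf("2 0 obj")};
        boolean[] inUse = {false, true, true};
        System.out.println("numbered from 0: " + countMatches(pdf, offsets, inUse, 0));
        System.out.println("numbered from 1: " + countMatches(pdf, offsets, inUse, 1));
    }
}
```

If numbering from 0 matches more entries than numbering from the value in the file, the header's first number is probably off by one, as in the table quoted above.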
>>>>
>>>> From the PDF spec:
>>>>
>>>> The free entries in the cross-reference table form a linked list, with
>>>> each free entry containing the object number of the next. The first
>>>> entry in the table (object number 0) is always free and has a generation
>>>> number of 65,535; it is the head of the linked list of free objects
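
The rule quoted from the spec suggests a cheap sanity check for this kind of file. The sketch below is an illustration under stated assumptions, not how PDFBox handles it: XrefSubsectionCheck and headerLooksShifted are hypothetical names, and the input is assumed to be a classic (non-stream) xref table passed in as text. It flags a table whose first entry looks like the head of the free list (generation 65535, type f) while the header claims a first object number other than 0. This is only a heuristic, since a free entry with generation 65535 is not strictly unique to object 0.

```java
// Hypothetical helper, not part of PDFBox.
final class XrefSubsectionCheck {

    /**
     * Returns true when the first entry looks like the free-list head
     * ("... 65535 f") but the header's first object number is not 0.
     */
    static boolean headerLooksShifted(String xrefText) {
        String[] lines = xrefText.trim().split("\\R");
        if (lines.length < 3 || !lines[0].trim().equals("xref")) {
            throw new IllegalArgumentException("not a classic xref table");
        }
        // Header line: "<first object number> <entry count>"
        String[] header = lines[1].trim().split("\\s+");
        int firstObj = Integer.parseInt(header[0]);
        // Entry line: "<10-digit offset> <5-digit generation> <n|f>"
        String[] entry = lines[2].trim().split("\\s+");
        boolean freeHead = entry.length == 3
                && entry[1].equals("65535")
                && entry[2].equals("f");
        return freeHead && firstObj != 0;
    }

    public static void main(String[] args) {
        String table = "xref\n1 7\n0000000000 65535 f\n0000000009 00000 n\ntrailer\n";
        System.out.println("header looks shifted: " + headerLooksShifted(table));
    }
}
```

A tolerant parser could use such a check to decide whether to renumber the entries from 0 before resolving objects; whether that is the right repair for a given file is a separate decision.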
>>>>
>>>> Tilman
>>>>
>>>>
>>>>> On 11.02.2015 at 08:21, Kevin Morin wrote:
>>>>> Hi,
>>>>>
>>>>> I am sorry, it seems that I did not send you the right file...
>>>>> Actually, I was also testing the wrong file on Linux from the beginning.
>>>>> The right file also displays blank on Linux, with Java 7 or 8...
>>>>> Here is the right file.
>>>>>
>>>>> I am sorry to make you work for nothing...
>>>>>
>>>>> BR
>>>>>
>>>>> Kevin
>>>>>
>>>>>
>>>>>> On 10/02/2015 21:32, Tilman Hausherr wrote:
>>>>>> So we e-mailed and the result is
>>>>>> - you're really working on W2008 with the file that you sent me
>>>>>> - you get the same error on W2008 with the app (and I don't)
>>>>>>
>>>>>> I have analysed that file and did some debug traces. If loading that on
>>>>>> W2008 is a no-no, you'd have to build from source and I'll tell you the
>>>>>> changes.
>>>>>>
>>>>>> http://home.snafu.de/tilman/tmp/pdfbox-app-2.0.0-TILMAN.jar
>>>>>>
>>>>>> Don't use that version for production. It contains lots of stuff for my
>>>>>> own tests. Only use it for this problem. Here's the output that you
>>>>>> should get:
>>>>>>
>>>>>> Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.COSParser parseXrefStream
>>>>>> INFORMATION: parseXrefStream: objByteOffset = 116
>>>>>> Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.PDFXrefStreamParser parse
>>>>>> INFORMATION: PDFXrefStreamParser: 7 0 obj at offset: 16
>>>>>> Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.PDFXrefStreamParser parse
>>>>>> INFORMATION: PDFXrefStreamParser: 8 0 obj at offset: 573
>>>>>> Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.PDFXrefStreamParser parse
>>>>>> INFORMATION: PDFXrefStreamParser: 9 0 obj at offset: 633
>>>>>> Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.PDFXrefStreamParser parse
>>>>>> INFORMATION: PDFXrefStreamParser: 10 0 obj at offset: 817
>>>>>> Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.PDFXrefStreamParser parse
>>>>>> INFORMATION: PDFXrefStreamParser: 11 0 obj at offset: 914
>>>>>> Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.PDFXrefStreamParser parse
>>>>>> INFORMATION: PDFXrefStreamParser: 12 0 obj at offset: 116
>>>>>> Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.PDFXrefStreamParser parse
>>>>>> INFORMATION: PDFXrefStreamParser: 13 0 obj at offset: 436
>>>>>> Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.COSParser parseXrefStream
>>>>>> INFORMATION: parseXrefStream: objByteOffset = 363505
>>>>>> Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.PDFXrefStreamParser parse
>>>>>> INFORMATION: PDFXrefStreamParser: 1 0 obj at offset: 359638
>>>>>> Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.PDFXrefStreamParser parse
>>>>>> INFORMATION: PDFXrefStreamParser: 2 0 obj at offset: 363167
>>>>>> Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.PDFXrefStreamParser parse
>>>>>> INFORMATION: PDFXrefStreamParser: 3 0 obj at offset: 363307
>>>>>> Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.PDFXrefStreamParser parse
>>>>>> INFORMATION: PDFXrefStreamParser: 4 0 obj at offset: 363505
>>>>>> Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.PDFXrefStreamParser parse
>>>>>> INFORMATION: PDFXrefStreamParser: 5 stmnr: 2
>>>>>> Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.PDFXrefStreamParser parse
>>>>>> INFORMATION: PDFXrefStreamParser: 6 stmnr: 3
>>>>>>
>>>>>> What I wonder is if the offsets will be the same.
>>>>>>
>>>>>> Tilman
>>>>>>
>>>>>> PS: Sorry I usually can't help during EU business hours. Day job :-)
>>>>>>
>>>>>>
>>>>>>> On 09.02.2015 at 11:26, Kevin Morin wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> I will probably have to migrate to java 8 because of a bug in java 7
>>>>>>> which throws an error when rendering a certain type of PDF (cf thread
>>>>>>> Error on PDFRenderer.renderImage (PDFBox 2.0)). Could someone please
>>>>>>> check why it is not working on Windows Server 2008 R2 Standard? If you
>>>>>>> do not have this OS, tell me what I can do to help you.
>>>>>>>
>>>>>>> Thanks
>>>>>>> BR
>>>>>>>
>>>>>>> Kevin
>>>>>>>
>>>>>>>> On 21/01/2015 12:26, Andreas Lehmkühler wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>>> On 21 January 2015 at 12:14, Kevin Morin <morin@codelutin.com> wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I thought I was running java 7 but it's java 8... I tried with java 7
>>>>>>>>> and it works. I do not need it to work with java 8, java 7 is ok for
>>>>>>>>> me.
>>>>>>>> It works for me using java 8 on win7 and linux as well. I guess, the
>>>>>>>> issue has to be something else....
>>>>>>>>
>>>>>>>>
>>>>>>>> BR
>>>>>>>> Andreas Lehmkühler
>>>>>>>>
>>>>>>>>> Thanks for your help and for all your work.
>>>>>>>>>
>>>>>>>>> Kevin
>>>>>>>>>
>>>>>>>>>> On 21/01/2015 11:54, Maruan Sahyoun wrote:
>>>>>>>>>> Hi Kevin
>>>>>>>>>>
>>>>>>>>>> works for me - what's your Java Version?
>>>>>>>>>>
>>>>>>>>>> BR
>>>>>>>>>> Maruan
>>>>>>>>>>
>>>>>>>>>>> On 21.01.2015 at 11:24, Kevin Morin <morin@codelutin.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> it does not work with PDFToImage either, I still get a blank
>>>>>>>>>>> image. Plus, I did not set the nonSeq option however it seems to
>>>>>>>>>>> be using the non sequential parser. And I have the following traces:
>>>>>>>>>>> janv. 21, 2015 11:20:02 AM org.apache.pdfbox.pdfparser.NonSequentialPDFParser checkXrefOffsets
>>>>>>>>>>> GRAVE: Can't find the object 7 0 (origin offset 359138)
>>>>>>>>>>> janv. 21, 2015 11:20:03 AM org.apache.pdfbox.contentstream.PDFStreamEngine operatorException
>>>>>>>>>>> GRAVE: Missing XObject: Im1
>>>>>>>>>>>
>>>>>>>>>>> BR
>>>>>>>>>>>
>>>>>>>>>>> Kevin
>>>>>>>>>>>
>>>>>>>>>>>> On 21/01/2015 11:11, Maruan Sahyoun wrote:
>>>>>>>>>>>> Hi Kevin,
>>>>>>>>>>>>
>>>>>>>>>>>> you can test with the PDFToImage command [1] available in the
>>>>>>>>>>>> pdfbox-app [2] to see if the issue happens there. The source for
>>>>>>>>>>>> PDFToImage is available in the tools section of the SVN repo or
>>>>>>>>>>>> viewable online [3].
>>>>>>>>>>>>
>>>>>>>>>>>> BR
>>>>>>>>>>>> Maruan
>>>>>>>>>>>>
>>>>>>>>>>>> [1] https://pdfbox.apache.org/1.8/commandline.html#pdfToImage
>>>>>>>>>>>> [2] https://repository.apache.org/content/groups/snapshots/org/apache/pdfbox/pdfbox-app/2.0.0-SNAPSHOT/
>>>>>>>>>>>> [3] http://svn.apache.org/viewvc/pdfbox/trunk/tools/src/main/java/org/apache/pdfbox/tools/PDFToImage.java?view=markup
>>>>>>>>>>>>
>>>>>>>>>>>>> On 21.01.2015 at 11:00, Kevin Morin <morin@codelutin.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Andreas,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I am using the latest snapshot available on the Maven repository,
>>>>>>>>>>>>> and I am running my app on Windows Server 2008 R2 Standard, where
>>>>>>>>>>>>> it does not work (white page). Could you send me the code or a jar
>>>>>>>>>>>>> to test on this server, to check whether the problem comes from my
>>>>>>>>>>>>> code?
>>>>>>>>>>>>>
>>>>>>>>>>>>> BR
>>>>>>>>>>>>>
>>>>>>>>>>>>> Kevin
>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 19/01/2015 19:13, Andreas Lehmkuehler wrote:
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 19.01.2015 at 12:45, Kevin Morin wrote:
>>>>>>>>>>>>>>> Actually, the issue is not only these traces. The real issue is
>>>>>>>>>>>>>>> that I have a blank image when I try to render the document.
>>>>>>>>>>>>>> I've checked your PDF and everything renders fine. I've tried
>>>>>>>>>>>>>> SNAPSHOT-891 on linux (running java 1.8, 1.7 and 1.6) and the
>>>>>>>>>>>>>> latest SNAPSHOT-947 on win7 running java 1.7
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Maybe your SNAPSHOT is outdated?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> BR
>>>>>>>>>>>>>> Andreas Lehmkühler
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 19/01/2015 12:39, Kevin Morin wrote:
>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I am using the 2.0 snapshot version to render PDFs to images,
>>>>>>>>>>>>>>>> but on some documents, I have the following error when I call
>>>>>>>>>>>>>>>> PDDocument.load(file):
>>>>>>>>>>>>>>>> 2015/01/19 12:32:48 ERROR (org.apache.pdfbox.pdfparser.NonSequentialPDFParser:1864) - Can't find the object 7 0 (origin offset 359138)
>>>>>>>>>>>>>>>> 2015/01/19 12:32:48 ERROR (org.apache.pdfbox.contentstream.PDFStreamEngine:840) - Missing XObject: Im1
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I first had it a few days ago (I did not report it, shame on
>>>>>>>>>>>>>>>> me), but the error did not occur when I called the loadLegacy
>>>>>>>>>>>>>>>> method on PDDocument. But the loadLegacy method is not
>>>>>>>>>>>>>>>> available anymore...
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The issue happens on Windows
(works fine on Debian).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks for your help
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Kevin
>>>>>>>
>>>>>>> ---------------------------------------------------------------------
>>>>>>> To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
>>>>>>> For additional commands, e-mail: users-help@pdfbox.apache.org