pdfbox-dev mailing list archives

From "Takashi Komatsubara (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PDFBOX-361) NullPointerException in PDPageNode.getAllKids
Date Mon, 09 Feb 2009 09:19:00 GMT

    [ https://issues.apache.org/jira/browse/PDFBOX-361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12671774#action_12671774 ]

Takashi Komatsubara commented on PDFBOX-361:
--------------------------------------------

Hello, 

I have just fetched the latest code and can confirm that James's change works well.

Here is the part that we have to change.

PDFParser.java ------------------ ( begin )

            else if( !pdfSource.isEOF() )
            {

                // PDF Spec 1.5 introduced "Cross Reference Streams"
                // There can be multiple "%%EOF" strings in the file
                return parseObject ();

/************************
                //we might really be at the end of the file, there might just be some crap at the
                //end of the file.
                pdfSource.fillBuffer();
                if( pdfSource.available() < 1000 )
                {
                    //We need to determine if we are at the end of the file.
                    byte[] data = new byte[ 1000 ];

                    int amountRead = pdfSource.read( data );
                    if( amountRead != -1 )
                    {
                        pdfSource.unread( data, 0, amountRead );
                    }
                    boolean atEndOfFile = true;//we assume yes unless we find another.
                    for( int i=0; i<amountRead-3 && atEndOfFile; i++ )
                    {
                        atEndOfFile = !(data[i] == 'E' &&
                                        data[i+1] == 'O' &&
                                        data[i+2] == 'F' );
                    }
                    if( atEndOfFile )
                    {
                        while( pdfSource.read( data, 0, data.length ) != -1 )
                        {
                            //read until done.
                        }
                    }
                }
***************************/
(End)
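
For illustration only, here is a minimal standalone sketch of the situation the new comment describes: a byte stream in which more objects follow the first "%%EOF" marker, so stopping at that marker would lose them. The class name EofMarkerScan, the method findEofMarkers, and the sample bytes are hypothetical and are not part of PDFBox; the sample is not a valid PDF, only a stand-in for the marker layout.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of PDFBox.
final class EofMarkerScan {
    private static final byte[] MARKER = "%%EOF".getBytes(StandardCharsets.US_ASCII);

    /** Returns the byte offsets at which a "%%EOF" marker starts in data. */
    static List<Integer> findEofMarkers(byte[] data) {
        List<Integer> offsets = new ArrayList<>();
        for (int i = 0; i + MARKER.length <= data.length; i++) {
            boolean match = true;
            for (int j = 0; j < MARKER.length; j++) {
                if (data[i + j] != MARKER[j]) {
                    match = false;
                    break;
                }
            }
            if (match) {
                offsets.add(i);
            }
        }
        return offsets;
    }

    public static void main(String[] args) {
        // Synthetic stand-in: two object sections, each closed by its own marker.
        byte[] sample = ("%PDF-1.5\n1 0 obj\nendobj\n%%EOF\n"
                + "2 0 obj\nendobj\n%%EOF\n").getBytes(StandardCharsets.US_ASCII);
        List<Integer> offsets = findEofMarkers(sample);
        System.out.println("markers found: " + offsets.size() + " at " + offsets);
    }
}
```

A parser that treats the first marker as the end of input would never see the second object in this sample, which is why the change above keeps calling parseObject() while the source is not at EOF.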

Takashi.

> NullPointerException in PDPageNode.getAllKids
> ---------------------------------------------
>
>                 Key: PDFBOX-361
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-361
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>            Reporter: Jukka Zitting
>         Attachments: Long_9.pdf, PDFParser.java
>
>
> [Issue from SourceForge]
> http://sourceforge.net/tracker/index.php?func=detail&aid=2008371&group_id=78314&atid=552832
> The parser cannot seem to find the Pages object in files created with
> Acrobat Pro 9. A sample file is attached.
> public static void main(String[] argv) throws Exception {
>     String name = "./test.pdf";
>     PDDocument doc = PDDocument.load(name);
>     doc.close();
>     PDPageNode root = doc.getDocumentCatalog().getPages();
>     ArrayList<PDPage> pages = new ArrayList<PDPage>();
>     root.getAllKids(pages);
>     System.out.println("pages.size() == "+pages.size());
> }
> Exception in thread "main" java.lang.NullPointerException
> at org.pdfbox.pdmodel.PDPageNode.getAllKids(PDPageNode.java:194)
> at org.pdfbox.pdmodel.PDPageNode.getAllKids(PDPageNode.java:182)
> http://sourceforge.net/tracker/download.php?group_id=78314&atid=552832&file_id=283367&aid=2008371
> [Comment on SourceForge]
> Date: 2008-07-02 00:57
> Sender: foundart
> Logged In: YES 
> user_id=1693709
> Originator: YES
> This happens with the latest code from CVS and also in older versions.
> [Comment on SourceForge]
> Date: 2008-07-14 17:25
> Sender: orthello
> Logged In: YES 
> user_id=853566
> Originator: NO
> We are experiencing the same problem.  Offending pdf available if any of
> you need it (jwilson@nmcourt.fed.us).  Looks like pdfbox does not support
> some new feature introduced in Acrobat 9.
> [Comment on SourceForge]
> Date: 2008-07-14 23:20
> Sender: foundart
> Logged In: YES 
> user_id=1693709
> Originator: YES
> In Acrobat 8, the default was to generate PDFs following version 1.4 of
> the PDF specification.  In Acrobat 9, the default is to generate PDFs
> following version 1.5 of the PDF specification.  PDF1.5 has objects known
> as cross-reference streams and it turns out that PDFBox does not parse them
> correctly.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

