Message-ID: <337480080.1233758639928.JavaMail.jira@brutus>
Date: Wed, 4 Feb 2009 06:43:59 -0800 (PST)
From: "Timo Boehme (JIRA)"
To: pdfbox-dev@incubator.apache.org
Reply-To: pdfbox-dev@incubator.apache.org
Subject: [jira] Updated: (PDFBOX-418) PDFStreamParser reads incorrect number (patch provided)
In-Reply-To: <1405557537.1233758282289.JavaMail.jira@brutus>

     [ https://issues.apache.org/jira/browse/PDFBOX-418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Timo Boehme updated PDFBOX-418:
-------------------------------

    Description:
With one of our documents, PDFBox (compiled Incubator version 2009-02-04) throws a floating-point number exception.
The reason: PDFStreamParser's parseNextToken() method is not strict enough when reading a number. In our case the string read was '97.-96', which clearly cannot be parsed as a number. Thus, when parsing a number (starting at line 238), '+' and '-' should only be allowed in the first position (perhaps one should also make sure that '.' can only be read once).
The following patch completely replaces the code after "case '.':" at line 236. The first condition in the replaced code is not necessary, since that test is already done by the 'case' statements, so we don't have to throw an exception either.

StringBuffer buf = new StringBuffer();

buf.append( c );
pdfSource.read();

boolean dotNotRead = (c != '.');

while( Character.isDigit(( c = (char)pdfSource.peek()) ) || (dotNotRead && (c == '.')) )
{
    buf.append( c );
    pdfSource.read();

    if (dotNotRead && (c == '.'))
        dotNotRead = false;
}
retval = COSNumber.get( buf.toString() );
break;

    was:
With one of our documents, PDFBox (compiled Incubator version 2009-02-04) throws a floating-point number exception.
The reason: PDFStreamParser's parseNextToken() method is not strict enough when reading a number. In our case the string read was '97.-96', which clearly cannot be parsed as a number. Thus, when parsing a number (starting at line 238), '+' and '-' should only be allowed in the first position (perhaps one should also make sure that '.' can only be read once).

StringBuffer buf = new StringBuffer();

buf.append( c );
pdfSource.read();

boolean dotNotRead = (c != '.');

while( Character.isDigit(( c = (char)pdfSource.peek()) ) || (dotNotRead && (c == '.')) )
{
    buf.append( c );
    pdfSource.read();

    if (dotNotRead && (c == '.'))
        dotNotRead = false;
}
retval = COSNumber.get( buf.toString() );
break;


> PDFStreamParser reads incorrect number (patch provided)
> -------------------------------------------------------
>
>                 Key: PDFBOX-418
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-418
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>    Affects Versions: 0.8.0-incubator
>            Reporter: Timo Boehme
>             Fix For: 0.8.0-incubator
>
>   Original Estimate: 0.08h
>  Remaining Estimate: 0.08h
>
> With one of our documents, PDFBox (compiled Incubator version 2009-02-04) throws a floating-point number exception.
> The reason: PDFStreamParser's parseNextToken() method is not strict enough when reading a number. In our case the string read
> was '97.-96', which clearly cannot be parsed as a number. Thus, when parsing a number (starting at line 238), '+' and '-' should
> only be allowed in the first position (perhaps one should also make sure that '.' can only be read once).
> The following patch completely replaces the code after "case '.':" at line 236. The first condition in the replaced code is not
> necessary, since that test is already done by the 'case' statements, so we don't have to throw an exception either.
> StringBuffer buf = new StringBuffer();
>
> buf.append( c );
> pdfSource.read();
>
> boolean dotNotRead = (c != '.');
>
> while( Character.isDigit(( c = (char)pdfSource.peek()) ) || (dotNotRead && (c == '.')) )
> {
>     buf.append( c );
>     pdfSource.read();
>
>     if (dotNotRead && (c == '.'))
>         dotNotRead = false;
> }
> retval = COSNumber.get( buf.toString() );
> break;

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
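
For readers who want to try the scanning rule from this issue outside of PDFBox, here is a minimal standalone sketch. It is not the PDFBox code: the class name StrictNumberTokenizer, the method readNumberToken and the peek helper are hypothetical, and a plain java.io.PushbackReader stands in for the parser's pdfSource. Like the patch, it assumes the caller has already checked that the first character may start a number (a sign, a digit or '.').

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.StringReader;

// Hypothetical standalone illustration of the rule proposed in PDFBOX-418:
// a sign is only accepted as the first character, and '.' at most once.
public class StrictNumberTokenizer {

    // Look at the next character without consuming it; returns -1 at end of input.
    private static int peek(PushbackReader in) throws IOException {
        int next = in.read();
        if (next != -1) {
            in.unread(next);
        }
        return next;
    }

    // Reads one number token. The first character is taken as-is (assumption:
    // the caller already verified it is '+', '-', '.' or a digit).
    public static String readNumberToken(PushbackReader in) throws IOException {
        StringBuilder buf = new StringBuilder();
        int first = in.read();
        if (first == -1) {
            return "";
        }
        buf.append((char) first);

        // True until a decimal point has been consumed.
        boolean dotNotRead = (first != '.');

        int next;
        while ((next = peek(in)) != -1
                && (Character.isDigit(next) || (dotNotRead && next == '.'))) {
            buf.append((char) in.read());
            if (next == '.') {
                dotNotRead = false;
            }
        }
        return buf.toString();
    }

    public static void main(String[] args) throws IOException {
        // The input reported in the issue now splits into two tokens.
        PushbackReader in = new PushbackReader(new StringReader("97.-96"));
        System.out.println(readNumberToken(in)); // prints "97."
        System.out.println(readNumberToken(in)); // prints "-96"
    }
}
```

With this rule the reported input '97.-96' is no longer read as a single token: the scan stops before the '-', which then starts the next number.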