Date: Fri, 9 Mar 2012 07:11:57 +0000 (UTC)
From: Andreas Lehmkühler (Commented) (JIRA)
To: dev@pdfbox.apache.org
Reply-To: dev@pdfbox.apache.org
Message-ID: <871541213.42562.1331277117464.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <1739338213.5666.1329829594316.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (PDFBOX-1232) FlateDecoder in stream mode
Content-Type: text/plain; charset=utf-8

[
https://issues.apache.org/jira/browse/PDFBOX-1232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13225903#comment-13225903 ]

Andreas Lehmkühler commented on PDFBOX-1232:
--------------------------------------------

Just to be sure, what is the expected result of decompressing the given 5-byte stream?

> FlateDecoder in stream mode
> ---------------------------
>
>                 Key: PDFBOX-1232
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1232
>             Project: PDFBox
>          Issue Type: Bug
>            Reporter: Dave Smith
>
> zlib (the underlying spec for Flate compression) does not require a Z_STREAM_END to terminate the compressed data. Java's InflaterInputStream effectively assumes that you are reading a zip or gzip file, which will always end with Z_STREAM_END (a constant in the zlib library, which Java calls natively). So the following chunk decodes fine using the jcraft zlib decoder but fails using InflaterInputStream.
> 3 0 obj
> <<
> /Type /XObject
> /Subtype /Form
> /FormType 1
> /Resources << /Font 4 0 R
> /ProcSet [/PDF /ImageC /Text]>>
> /BBox [0 0 595 842]
> /Matrix [1 0 0 1 0 0]
> /Filter /FlateDecode
> /Length 5 >>
> stream
> H<89>^C^@
> endstream
> endobj
> The blob is 72, -119, 3, 0, 13 (decimal, as signed bytes). It decodes to an empty string.
> The fix is to use Inflater directly: check whether it has consumed all of the input buffer and make sure it has nothing left to write into the output buffer.
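To make the 5-byte case concrete, the following is a minimal Java sketch and not code from the report. The class name `TruncatedFlateDemo` and the method `inflateAvailable` are hypothetical. It feeds the blob to `java.util.zip.Inflater` and stops as soon as `inflate()` yields no more output, instead of insisting that `finished()` becomes true. The blob appears to contain a zlib header and one empty final block but no Adler-32 trailer, so under that assumption the call should return an empty array, in line with the reporter's statement.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Inflater;

// Hypothetical demo class; not part of PDFBox.
class TruncatedFlateDemo {

    // Inflates as much of `data` as possible and returns whatever was produced,
    // even when the end of the zlib stream is never reached.
    static byte[] inflateAvailable(byte[] data) throws DataFormatException {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(data);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[1024];
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0) {
                    // No output: the input is exhausted or a preset dictionary
                    // is required. Stop here rather than treating it as an error.
                    break;
                }
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } finally {
            inflater.end(); // release native zlib resources
        }
    }

    public static void main(String[] args) throws DataFormatException {
        // The five bytes quoted in the issue, as signed Java bytes.
        byte[] blob = {72, -119, 3, 0, 13};
        System.out.println("decoded length: " + inflateAvailable(blob).length);
    }
}
```

The sketch takes the whole input as a byte array for brevity; a stream-based variant has to refill the input whenever the inflater runs dry, which is what the reporter's `decompress` method does.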
> // Requires java.io.ByteArrayOutputStream, java.io.IOException, java.io.InputStream,
> // java.util.zip.DataFormatException and java.util.zip.Inflater.
> protected ByteArrayOutputStream decompress(InputStream in)
>     throws IOException, DataFormatException
> {
>     ByteArrayOutputStream out = new ByteArrayOutputStream();
>     byte[] buf = new byte[1000];
>     Inflater inflater = new Inflater();
>     try
>     {
>         int read = in.read(buf);
>         // read() returns -1 at end of stream, so test for <= 0 rather than == 0
>         if (read <= 0)
>         {
>             return out;
>         }
>         inflater.setInput(buf, 0, read);
>         byte[] res = new byte[1000];
>         while (true)
>         {
>             int resRead = inflater.inflate(res);
>             if (resRead != 0)
>             {
>                 out.write(res, 0, resRead);
>                 continue;
>             }
>             // no output: stop if the stream ended or a preset dictionary is required
>             if (inflater.finished() || inflater.needsDictionary())
>             {
>                 return out;
>             }
>             // otherwise refill the input; stop once the source is exhausted
>             // (checking the read() result is more reliable than in.available())
>             read = in.read(buf);
>             if (read <= 0)
>             {
>                 return out;
>             }
>             inflater.setInput(buf, 0, read);
>         }
>     }
>     finally
>     {
>         inflater.end(); // release native zlib resources
>     }
> }
> We then need to change FlateFilter.decode(InputStream compressedData, OutputStream result,
>     COSDictionary options, int filterIndex )
> to look like ...
> if (compressedData.available() > 0)
> {
>     try
>     {
>         baos = decompress(compressedData);
>     }
>     if (predictor == -1 || predictor == 1)
>     {
>         result.write(baos.toByteArray());
>     }
>     else
>     {
>         // use the ByteArrayOutputStream as before ...
>     }
> }

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira