pdfbox-users mailing list archives

From Dave Smith <dave.sm...@candata.com>
Subject Problems With FlateDecoder
Date Sun, 19 Feb 2012 03:51:01 GMT
I am having problems with certain PDF documents and the FlateDecoder.
For example, take a chunk like this:
3 0 obj
<<
/Type /XObject
/Subtype /Form
/FormType 1
/Resources << /Font 4 0 R
/ProcSet [/PDF /ImageC /Text]>>
/BBox [0 0 595 842]
/Matrix [1 0 0 1 0 0]
/Filter /FlateDecode
/Length 5 >>
stream
H<89>^C^@
endstream
endobj

As signed decimal bytes, the blob is 72, -119, 3, 0, 13.
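
For reference, the first two bytes of that blob can be read as a zlib header: RFC 1950 calls them CMF and FLG. The following is only a small sketch for inspecting them; the class and method names (`ZlibHeaderSketch`, `describe`) are made up for illustration and are not part of PDFBox.

```java
final class ZlibHeaderSketch {
    // Decodes the two zlib header bytes (CMF, FLG) as laid out in RFC 1950.
    static String describe(byte cmfByte, byte flgByte) {
        int cmf = cmfByte & 0xFF;                 // Java bytes are signed, so mask to 0..255
        int flg = flgByte & 0xFF;
        int method = cmf & 0x0F;                  // 8 means deflate
        int windowBits = (cmf >> 4) + 8;          // log2 of the LZ77 window size
        boolean checkOk = ((cmf << 8) | flg) % 31 == 0;  // FCHECK makes CMF*256+FLG a multiple of 31
        boolean presetDict = (flg & 0x20) != 0;   // FDICT flag
        return "method=" + method
            + " windowBits=" + windowBits
            + " headerCheckOk=" + checkOk
            + " presetDict=" + presetDict;
    }

    public static void main(String[] args) {
        byte[] blob = {72, -119, 3, 0, 13};       // the stream body quoted above
        System.out.println(describe(blob[0], blob[1]));
    }
}
```

So 72, -119 is an ordinary deflate header with no preset dictionary; whatever goes wrong happens after it.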

If I run it through the JCraft zlib decoder it works (the result is an
empty string, but that is beside the point). In the latest trunk it
throws an end-of-data exception. The problem is that the compressed
chunk ends without a terminating bit in the stream, hence the EOF.
According to the deflate spec the terminator is not required, so I
would consider this a bug in Java's InflaterInputStream.
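
The difference can be reproduced outside PDFBox with the five bytes quoted above. This is a minimal sketch using only `java.util.zip`; the class and method names (`TruncatedZlibDemo`, `viaStream`, `viaInflater`) are hypothetical and chosen just for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Inflater;
import java.util.zip.InflaterInputStream;

final class TruncatedZlibDemo {
    // The stream body from the object above.
    static final byte[] BLOB = {72, -119, 3, 0, 13};

    // Reads the data through InflaterInputStream and reports what happened.
    static String viaStream(byte[] data) {
        try (InputStream in = new InflaterInputStream(new ByteArrayInputStream(data))) {
            int count = 0;
            while (in.read() != -1) {
                count++;
            }
            return "ok:" + count;
        } catch (EOFException e) {
            return "eof";                          // the stream wrapper gave up at end of input
        } catch (IOException e) {
            return "io:" + e.getMessage();
        }
    }

    // Feeds the data straight to Inflater; returns {bytesProduced, finished ? 1 : 0}.
    static int[] viaInflater(byte[] data) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(data);
            byte[] out = new byte[64];
            int produced = inflater.inflate(out);
            return new int[] {produced, inflater.finished() ? 1 : 0};
        } catch (DataFormatException e) {
            throw new IllegalStateException(e);
        } finally {
            inflater.end();                        // release native zlib memory
        }
    }

    public static void main(String[] args) {
        System.out.println("InflaterInputStream: " + viaStream(BLOB));
        int[] r = viaInflater(BLOB);
        System.out.println("Inflater: produced=" + r[0] + " finished=" + (r[1] == 1));
    }
}
```

The stream wrapper should fail with an `EOFException`, while the bare `Inflater` simply produces no bytes and never reports `finished()`. That is the state a hand-written loop has to treat as "done" instead of as an error.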


I recoded the decoder and it seems to work in all my test cases, which
include zlib streams both with and without Z_STREAM_END set. The code
is below ...



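    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import java.util.zip.DataFormatException;
    import java.util.zip.Inflater;

    protected ByteArrayOutputStream decompress(InputStream in)
        throws IOException, DataFormatException
    {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1000];
        byte[] res = new byte[1000];
        Inflater inflater = new Inflater();
        try
        {
            int read = in.read(buf);
            if (read <= 0)
            {
                // read() returns -1 at end of stream, so test <= 0 rather than == 0
                return out;
            }
            inflater.setInput(buf, 0, read);
            while (true)
            {
                int resRead = inflater.inflate(res);
                if (resRead != 0)
                {
                    out.write(res, 0, resRead);
                    continue;
                }
                if (inflater.finished() || inflater.needsDictionary())
                {
                    return out;
                }
                if (inflater.needsInput())
                {
                    // rely on read() for end of input; available() may report 0 too early
                    read = in.read(buf);
                    if (read <= 0)
                    {
                        // input ended without a stream end marker: keep what was decoded
                        return out;
                    }
                    inflater.setInput(buf, 0, read);
                }
            }
        }
        finally
        {
            // free the native zlib state
            inflater.end();
        }
    }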
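
One way to exercise a loop like the one above is to compress a short string, inflate it once intact and once with the trailing bytes cut off, and compare the results. The sketch below is self-contained and is not PDFBox code: `TolerantInflateCheck`, `inflateLenient`, `deflate` and `roundTrip` are hypothetical names, and dropping the last 4 bytes (the zlib checksum trailer) is just one assumed way of producing a stream that never reaches Z_STREAM_END.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

final class TolerantInflateCheck {
    // Inflates until the stream finishes or the input runs out, whichever comes first.
    static byte[] inflateLenient(InputStream in) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1000];
        byte[] res = new byte[1000];
        Inflater inflater = new Inflater();
        try {
            while (true) {
                int n = inflater.inflate(res);
                if (n > 0) {
                    out.write(res, 0, n);
                    continue;
                }
                if (inflater.finished() || inflater.needsDictionary() || !inflater.needsInput()) {
                    break;                         // done, or no way to make further progress
                }
                int read = in.read(buf);
                if (read <= 0) {
                    break;                         // input exhausted: keep what was decoded
                }
                inflater.setInput(buf, 0, read);
            }
            return out.toByteArray();
        } catch (IOException | DataFormatException e) {
            throw new IllegalStateException(e);
        } finally {
            inflater.end();
        }
    }

    // Produces a complete zlib stream (header, deflate data, checksum trailer).
    static byte[] deflate(byte[] data) {
        Deflater deflater = new Deflater();
        deflater.setInput(data);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1000];
        while (!deflater.finished()) {
            out.write(buf, 0, deflater.deflate(buf));
        }
        deflater.end();
        return out.toByteArray();
    }

    // Compresses text, drops the given number of trailing bytes, then inflates leniently.
    static String roundTrip(String text, int bytesToDrop) {
        byte[] full = deflate(text.getBytes(StandardCharsets.UTF_8));
        byte[] cut = Arrays.copyOf(full, full.length - bytesToDrop);
        return new String(inflateLenient(new ByteArrayInputStream(cut)), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println("complete stream:  " + roundTrip("hello", 0));
        System.out.println("trailer removed:  " + roundTrip("hello", 4));
    }
}
```

Both calls should give back the original text, which is the behaviour the lenient loop is after: a missing end marker costs nothing as long as the deflate data itself arrived.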


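and then the relevant part of
FlateFilter.decode(InputStream compressedData, OutputStream result,
COSDictionary options, int filterIndex)
looks like


    if (compressedData.available() > 0)
    {
        try
        {
            baos = decompress(compressedData);
        }
        catch (DataFormatException e)
        {
            // assumption: the original snippet showed no catch block;
            // corrupt data is reported here as an IOException
            throw new IOException("Flate stream is corrupt", e);
        }
        if (predictor == -1 || predictor == 1)
        {
            // no predictor in use: write the inflated bytes straight through
            result.write(baos.toByteArray());
        }
        else
        {
            // use the ByteArrayOutputStream as before ...
        }
    }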


Thoughts?

Dave Smith
Candata Ltd.
