pdfbox-dev mailing list archives

From "Antoni Mylka (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PDFBOX-1075) Can't get images from a PDF
Date Fri, 12 Aug 2011 11:08:27 GMT

    [ https://issues.apache.org/jira/browse/PDFBOX-1075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084046#comment-13084046 ]

Antoni Mylka commented on PDFBOX-1075:
--------------------------------------

Narrowed the problem further.

AFAIU:

In PDPixelMap.getImage the basic algorithm is:
1. get the color model
2. create a compatible WritableRaster
3. copy the bytes from the PDF to the buffer of the raster
4. initialize a buffered image with the color model and the raster

Only step 1 changed in r1133281; steps 2, 3, and 4 are the same.
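
For illustration, the four steps can be sketched with plain java.awt classes. This is not the PDPixelMap code: the class name PixelMapSketch, the method toImage, and the hard-coded two-entry black/white palette are assumptions made up for the example. In PDFBox the color model comes from the PDF's color space.

```java
import java.awt.image.BufferedImage;
import java.awt.image.DataBufferByte;
import java.awt.image.IndexColorModel;
import java.awt.image.WritableRaster;

// Hypothetical stand-in for the algorithm described above, not PDFBox code.
class PixelMapSketch {
    static BufferedImage toImage(byte[] packed, int width, int height) {
        // 1. get the color model (assumed here: 1 bit per pixel, black and white)
        byte[] levels = {0, (byte) 0xFF};
        IndexColorModel cm = new IndexColorModel(1, 2, levels, levels, levels);

        // 2. create a compatible WritableRaster
        WritableRaster raster = cm.createCompatibleWritableRaster(width, height);

        // 3. copy the image bytes into the buffer of the raster
        byte[] buffer = ((DataBufferByte) raster.getDataBuffer()).getData();
        System.arraycopy(packed, 0, buffer, 0, Math.min(packed.length, buffer.length));

        // 4. initialize a buffered image with the color model and the raster
        return new BufferedImage(cm, raster, false, null);
    }
}
```

Step 4 is where the IllegalArgumentException from the issue description is thrown when the raster and the color model do not match.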

In the old version, the returned color model was (taken from toString()):
IndexColorModel: #pixelBits = 1 numComponents = 4 color space = java.awt.color.ICC_ColorSpace@7736bd
transparency = 2 transIndex   = 1 has alpha = true isAlphaPre = false

In the new version:
IndexColorModel: #pixelBits = 1 numComponents = 3 color space = java.awt.color.ICC_ColorSpace@1bac748
transparency = 1 transIndex   = -1 has alpha = false isAlphaPre = false

The new version has only 3 components (no alpha channel, no transparency). The old version
included the alpha channel and the transparency.
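
The same difference can be reproduced outside PDFBox with the two IndexColorModel constructors. This is only a sketch: the class name ColorModelComparison and the black/white palette are assumptions, and it uses the default sRGB color space rather than the ICC color space shown in the output above.

```java
import java.awt.image.IndexColorModel;

// Hypothetical comparison of the two kinds of color model seen above.
class ColorModelComparison {
    private static final byte[] LEVELS = {0, (byte) 0xFF};

    // Like the old version: palette index 1 is marked transparent.
    static IndexColorModel withTransparentIndex() {
        return new IndexColorModel(1, 2, LEVELS, LEVELS, LEVELS, 1);
    }

    // Like the new version: no transparent index, so the model is opaque.
    static IndexColorModel opaque() {
        return new IndexColorModel(1, 2, LEVELS, LEVELS, LEVELS);
    }

    public static void main(String[] args) {
        for (IndexColorModel cm : new IndexColorModel[] {withTransparentIndex(), opaque()}) {
            System.out.println("numComponents = " + cm.getNumComponents()
                    + " transparency = " + cm.getTransparency()
                    + " transIndex = " + cm.getTransparentPixel()
                    + " has alpha = " + cm.hasAlpha());
        }
    }
}
```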

Now the real problem seems to have surfaced when I did something like this:

WritableRaster raster = cm.createCompatibleWritableRaster( width, height );            
if (!cm.isCompatibleRaster(raster)) {
        System.out.println("Color model created an incompatible raster");
}

So it seems that in the new version, for this PDF, the createCompatibleWritableRaster method
returns rasters that are NOT compatible with the color model that created them.
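
To narrow this down further, the properties of the color model and of the raster it creates could be printed side by side. The helper below is a hypothetical diagnostic (the class name RasterCompatibilityCheck and the method describe are made up for this example). Which of these values actually differs for this PDF has not been established here.

```java
import java.awt.image.IndexColorModel;
import java.awt.image.WritableRaster;

// Hypothetical diagnostic helper, not part of PDFBox.
class RasterCompatibilityCheck {
    static String describe(IndexColorModel cm, int width, int height) {
        // Let the color model create its own raster, then ask it whether
        // it accepts that raster and report the properties of both.
        WritableRaster raster = cm.createCompatibleWritableRaster(width, height);
        return "compatible=" + cm.isCompatibleRaster(raster)
                + " pixelBits=" + cm.getPixelSize()
                + " mapSize=" + cm.getMapSize()
                + " sampleBits=" + raster.getSampleModel().getSampleSize(0)
                + " bands=" + raster.getNumBands()
                + " rasterTransferType=" + raster.getTransferType()
                + " modelTransferType=" + cm.getTransferType();
    }
}
```

Calling describe(cm, width, height) with the color model obtained in step 1 gives one line per image that can be compared between the old and the new version.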

> Can't get images from a PDF
> ---------------------------
>
>                 Key: PDFBOX-1075
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1075
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>    Affects Versions: 1.6.0, 1.7.0
>            Reporter: Antoni Mylka
>
> This is a regression. In 1.4.0 I was able to extract images from a PDF file. In 1.6 and the current trunk I get exceptions:
> SEVERE: java.lang.IllegalArgumentException: Raster BytePackedRaster: width = 1000 height = 32 #channels 1 xOff = 0 yOff = 0 is incompatible with ColorModel IndexColorModel: #pixelBits = 1 numComponents = 3 color space = java.awt.color.ICC_ColorSpace@1050169 transparency = 1 transIndex   = -1 has alpha = false isAlphaPre = false
> java.lang.IllegalArgumentException: Raster BytePackedRaster: width = 1000 height = 32 #channels 1 xOff = 0 yOff = 0 is incompatible with ColorModel IndexColorModel: #pixelBits = 1 numComponents = 3 color space = java.awt.color.ICC_ColorSpace@1050169 transparency = 1 transIndex   = -1 has alpha = false isAlphaPre = false
> 	at java.awt.image.BufferedImage.<init>(BufferedImage.java:611)
> 	at org.apache.pdfbox.pdmodel.graphics.xobject.PDPixelMap.getRGBImage(PDPixelMap.java:252)
> 	at org.apache.pdfbox.pdmodel.graphics.xobject.PDPixelMap.write2OutputStream(PDPixelMap.java:289)
> 	at org.apache.pdfbox.TestGovdocs148902.test148902(TestGovdocs148902.java:58)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> 	at java.lang.reflect.Method.invoke(Method.java:597)
> 	at junit.framework.TestCase.runTest(TestCase.java:168)
> 	at junit.framework.TestCase.runBare(TestCase.java:134)
> 	at junit.framework.TestResult$1.protect(TestResult.java:110)
> 	at junit.framework.TestResult.runProtected(TestResult.java:128)
> 	at junit.framework.TestResult.run(TestResult.java:113)
> 	at junit.framework.TestCase.run(TestCase.java:124)
> 	at junit.framework.TestSuite.runTest(TestSuite.java:232)
> 	at junit.framework.TestSuite.run(TestSuite.java:227)
> 	at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83)
> 	at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:49)
> 	at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
> 	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
> 	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
> 	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
> 	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
> http://domex.nps.edu/corp/files/govdocs1/148/148902.pdf
> I pasted it into the src/test/resources/pdfparser folder and ran a test case like this:
> public class TestGovdocs148902 extends TestCase
> {
>     public void test148902() throws IOException {
>         PDDocument doc = PDDocument.load( "src/test/resources/pdfparser/148902.pdf");
>         int imageCounter = 0;
>         COSDocument cosDoc = doc.getDocument();
>         List<COSObject> list = cosDoc.getObjectsByType(COSName.XOBJECT);
>         
>         for (COSObject cosOb : list) {
>             COSBase baseObject = cosOb.getObject();
>             if (baseObject != null && baseObject instanceof COSStream) {
>                 COSStream st = (COSStream)baseObject;
>                 String subtype = st.getNameAsString(COSName.SUBTYPE);
>                 if (subtype != null && subtype.equalsIgnoreCase("image")) {
>                     PDXObjectImage ximage = (PDXObjectImage)PDXObject.createXObject( st );
>                     if (ximage != null && ximage.getWidth() >= 5 && ximage.getHeight() >= 5) {
>                         ByteArrayOutputStream baos = new ByteArrayOutputStream();
>                         ximage.write2OutputStream(baos);
>                         byte [] bytes = baos.toByteArray();
>                         if (bytes.length > 0) {
>                             imageCounter++;
>                         }
>                     }
>                 }
>             }
>         }
>         
>         assertEquals(32, imageCounter);
>     }
> }
> The test case passes in 1.4.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
