pdfbox-users mailing list archives

From Tilman Hausherr <THaush...@t-online.de>
Subject Re: PDFBOX 2 scanned documents
Date Wed, 23 Sep 2015 18:44:45 GMT
On 23.09.2015 at 20:39, Tim Daley wrote:
> Whoops! I don't see PDFDebugger in PDFBox 2. Oversight? I'll get it out of
> Version 1.

You can't attach PDF files. Upload them somewhere.

PDFDebugger is there, I even made a change earlier today! It is part of 
the PDFBox app jar.

Tilman



>
> On Wed, Sep 23, 2015 at 2:35 PM, Tim Daley <tim.daley@cru.org> wrote:
>
>> The PDF is at the bottom of the email. Aha! PDFDebugger!
>>
>> On Wed, Sep 23, 2015 at 1:31 PM, Tilman Hausherr <THausherr@t-online.de>
>> wrote:
>>
>>> The XObjects should be the same count in version 1 and 2.
>>>
>>> If you don't want to share the PDFs, then look at them with the new
>>> PDFDebugger. You can see the XObject images easily.
>>>
>>> Tilman
>>>
>>> On 23.09.2015 at 19:21, Tim Daley wrote:
>>>
>>>> Here's the basic code that used to work. Granted, it probably depends
>>>> heavily on Version 1's structure.
>>>>
>>>>
>>>> PDPage pdPage = CFCAPDFInputProgressBar.this.pdPages.get(i);
>>>> Map<COSName, PDXObject> images = new TreeMap<COSName, PDXObject>();
>>>> PDResources pdResources = pdPage.getResources();
>>>> for (Entry<COSName, PDXObject> objectImageEntry : images.entrySet())
>>>> {
>>>>   PDXObject pdXObject = objectImageEntry.getValue();
>>>>   if (pdXObject instanceof PDImageXObject)
>>>>   {
>>>>     PDImageXObject pdXObjectImage = (PDImageXObject) pdXObject;
>>>>     BufferedImage bufferedImage = null;
>>>>     try
>>>>     {
>>>>       bufferedImage = pdXObjectImage.getImage();
>>>>     }
>>>>     catch (Throwable t)
>>>>     {
>>>>       t.printStackTrace();
>>>>       randomAccessFile.close();
>>>>       throw new RuntimeException(t);
>>>>     }
>>>>     if (CFCAPDFInputProgressBar.this.music.getLandscape())
>>>>       bufferedImage = rotate90DX(bufferedImage);
>>>>     int width = bufferedImage.getWidth();
>>>>     int height = bufferedImage.getHeight();
>>>>     if (CFCAPDFInputProgressBar.this.music.getTwoPage())
>>>>     {
>>>>       width /= 2;
>>>>       boolean even = i % 2 == 0;
>>>>       int rightPageNo = even ? i + 1 : pageCount * 2 - i;
>>>>       int leftPageNo = even ? pageCount * 2 - i : i + 1;
>>>>       putPage(bufferedImage, rightPageNo, width, 0, width, height);
>>>>       putPage(bufferedImage, leftPageNo, 0, 0, width, height);
>>>>     }
>>>>     else
>>>>     {
>>>>       int pageNo = CFCAPDFInputProgressBar.this.music.getStart() + i;
>>>>       putPage(bufferedImage, pageNo, 0, 0, width, height);
>>>>     }
>>>>   }
>>>> }
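
The two-page branch of the quoted code cuts each scanned image in half and gives the halves page numbers taken from opposite ends of the document, which looks like booklet-style imposition. A minimal, self-contained sketch of just that arithmetic follows. It uses only `java.awt.image.BufferedImage`; the class name `SpreadSplitter` and its method names are hypothetical, and the original `putPage` helper is not reproduced because its behavior is not shown in the thread.

```java
import java.awt.image.BufferedImage;

// Hypothetical helper illustrating the two-page logic from the quoted code.
final class SpreadSplitter {

    /**
     * Returns {leftPageNo, rightPageNo} for scanned sheet i (0-based),
     * using the same formula as the quoted code. pageCount is the number
     * of scanned sheets, so the page numbers run up to pageCount * 2.
     */
    static int[] pageNumbers(int i, int pageCount) {
        boolean even = i % 2 == 0;
        int rightPageNo = even ? i + 1 : pageCount * 2 - i;
        int leftPageNo = even ? pageCount * 2 - i : i + 1;
        return new int[] {leftPageNo, rightPageNo};
    }

    /**
     * Splits a spread into {left, right} halves. The halves share pixel
     * data with the source image (getSubimage does not copy). For an odd
     * width the last pixel column is dropped.
     */
    static BufferedImage[] split(BufferedImage spread) {
        int half = spread.getWidth() / 2;
        int height = spread.getHeight();
        BufferedImage left = spread.getSubimage(0, 0, half, height);
        BufferedImage right = spread.getSubimage(half, 0, half, height);
        return new BufferedImage[] {left, right};
    }

    public static void main(String[] args) {
        int pageCount = 4; // assumed example: 4 scanned sheets
        for (int i = 0; i < pageCount; i++) {
            int[] p = pageNumbers(i, pageCount);
            System.out.println("sheet " + i + ": left=" + p[0] + ", right=" + p[1]);
        }
    }
}
```

For four scanned sheets the formula pairs the pages, left then right, as (8, 1), (2, 7), (6, 3), (4, 5).
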
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Wed, Sep 23, 2015 at 1:06 PM, Tilman Hausherr <THausherr@t-online.de> wrote:
>>>>
>>>>      On 23.09.2015 at 17:33, Tim Daley wrote:
>>>>
>>>>          It appears that PDFBOX 2 handles scanned documents differently
>>>>          than PDFBOX 1.
>>>>
>>>>          I have multipage PDFs that I have scanned from a Konica/Minolta
>>>>          C224e. The PDFs in version 1 seemed to come in as a single image.
>>>>          Now in version 2, they seem to come in as multiple images. I
>>>>          assume this is to reduce the size of the resultant PDFs.
>>>>
>>>>          Is there a way to retrieve each page as a single image, or is
>>>>          there a method to merge all the images on a page into a single
>>>>          image?
>>>>
>>>>
>>>>      Can't comment without having a sample PDF. And I don't know what
>>>>      you mean by "seemed to come in as a single image".
>>>>
>>>>      Tilman
>>>>
>>>>      ---------------------------------------------------------------------
>>>>      To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
>>>>      For additional commands, e-mail: users-help@pdfbox.apache.org
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> *Tim Daley*
>>>> IT Specialist-Operating Systems
>>>> cru | Engagement & Services | Platform Team
>>>> o:407-826-2911 | m:407-716-0284
>>>> tim.daley@cru.org
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>
>
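
On the original question of merging several images from one page into a single image: the thread never establishes how the images are laid out, so the following is only a sketch under a loud assumption, namely that the scanner wrote each page as horizontal strips that stack top to bottom in the order given. The class `StripMerger` and its method are hypothetical and use only the JDK. In a real PDF the placement of each image is set by the page content stream, so rasterizing the whole page (PDFBox 2 provides `org.apache.pdfbox.rendering.PDFRenderer` for this) is likely the more robust route.

```java
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.util.List;

// Hypothetical helper: assumes strips are ordered top to bottom.
final class StripMerger {

    /** Draws the strips one below another into a single RGB image. */
    static BufferedImage stackVertically(List<BufferedImage> strips) {
        if (strips.isEmpty()) {
            throw new IllegalArgumentException("no strips to merge");
        }
        // The merged page is as wide as the widest strip and as tall
        // as all strips together.
        int width = 0;
        int height = 0;
        for (BufferedImage strip : strips) {
            width = Math.max(width, strip.getWidth());
            height += strip.getHeight();
        }
        BufferedImage page = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = page.createGraphics();
        try {
            int y = 0;
            for (BufferedImage strip : strips) {
                g.drawImage(strip, 0, y, null);
                y += strip.getHeight();
            }
        } finally {
            g.dispose();
        }
        return page;
    }
}
```

Areas not covered by a narrower strip stay black, the default fill of a new `TYPE_INT_RGB` image.
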



