pdfbox-users mailing list archives

From Tilman Hausherr <THaush...@t-online.de>
Subject Re: PDFBOX 2 scanned documents
Date Wed, 23 Sep 2015 17:31:10 GMT
The number of XObjects should be the same in versions 1 and 2.

If you don't want to share the PDFs, then look at them with the new 
PDFDebugger. You can see the XObject images easily.
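
As a programmatic complement to inspecting the file in PDFDebugger, the image XObjects on each page can also be counted in code. The following is only a minimal sketch, assuming the PDFBox 2.0.x API (`PDDocument.load`, `PDResources.getXObjectNames`, `PDResources.getXObject`); the class name `CountImageXObjects` and the helper `countImages` are hypothetical names chosen for illustration. It also descends into form XObjects, on the assumption that a scanner may nest images inside them.

```java
import java.io.File;
import java.io.IOException;

import org.apache.pdfbox.cos.COSName;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDResources;
import org.apache.pdfbox.pdmodel.graphics.PDXObject;
import org.apache.pdfbox.pdmodel.graphics.form.PDFormXObject;
import org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject;

// Hypothetical helper class; assumes PDFBox 2.0.x on the classpath.
public class CountImageXObjects {

    // Counts image XObjects in a resource dictionary, including images
    // nested inside form XObjects. A null dictionary counts as zero.
    // Note: a form that references itself would recurse without end;
    // this sketch does not guard against that.
    static int countImages(PDResources resources) throws IOException {
        if (resources == null) {
            return 0;
        }
        int count = 0;
        for (COSName name : resources.getXObjectNames()) {
            PDXObject xObject = resources.getXObject(name);
            if (xObject instanceof PDImageXObject) {
                count++;
            } else if (xObject instanceof PDFormXObject) {
                count += countImages(((PDFormXObject) xObject).getResources());
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to the scanned PDF.
        try (PDDocument document = PDDocument.load(new File(args[0]))) {
            int pageNo = 1;
            for (PDPage page : document.getPages()) {
                System.out.println("page " + pageNo + ": "
                        + countImages(page.getResources()) + " image XObject(s)");
                pageNo++;
            }
        }
    }
}
```

Running this against the same file and comparing the per-page numbers with what the 1.x code saw should show whether the count really differs between versions.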

Tilman

On 23.09.2015 at 19:21, Tim Daley wrote:
> Here's the basic code that used to work. Granted, it probably depends 
> heavily on Version 1's structure.
>
>
> PDPage pdPage = CFCAPDFInputProgressBar.this.pdPages.get(i);
> Map<COSName, PDXObject> images = new TreeMap<COSName, PDXObject>();
> PDResources pdResources = pdPage.getResources();
> for (Entry<COSName, PDXObject> objectImageEntry : images.entrySet())
> {
>   PDXObject pdXObject = objectImageEntry.getValue();
>   if (pdXObject instanceof PDImageXObject)
>   {
>     PDImageXObject pdXObjectImage = (PDImageXObject) pdXObject;
>     BufferedImage bufferedImage = null;
>     try
>     {
>       bufferedImage = pdXObjectImage.getImage();
>     }
>     catch (Throwable t)
>     {
>       t.printStackTrace();
>       randomAccessFile.close();
>       throw new RuntimeException(t);
>     }
>     if (CFCAPDFInputProgressBar.this.music.getLandscape())
>       bufferedImage = rotate90DX(bufferedImage);
>     int width = bufferedImage.getWidth();
>     int height = bufferedImage.getHeight();
>     if (CFCAPDFInputProgressBar.this.music.getTwoPage())
>     {
>       width /= 2;
>       boolean even = i % 2 == 0;
>       int rightPageNo = even ? i + 1 : pageCount * 2 - i;
>       int leftPageNo = even ? pageCount * 2 - i : i + 1;
>       putPage(bufferedImage, rightPageNo, width, 0, width, height);
>       putPage(bufferedImage, leftPageNo, 0, 0, width, height);
>     }
>     else
>     {
>       int pageNo = CFCAPDFInputProgressBar.this.music.getStart() + i;
>       putPage(bufferedImage, pageNo, 0, 0, width, height);
>     }
>   }
> }
>
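
The two-page branch of the quoted code above halves each scanned sheet and assigns booklet-style page numbers to the halves. A rough, standard-library-only sketch of that arithmetic follows. The class `TwoPageSplit` and its methods are hypothetical, and it is an assumption that `putPage` simply crops the given rectangle out of the image; the original helper is not shown in the thread.

```java
import java.awt.image.BufferedImage;

// Hypothetical illustration of the two-page split in the quoted code.
public class TwoPageSplit {

    // 1-based page number of the right half of 0-based sheet i,
    // using the same formula as the quoted code.
    static int rightPageNo(int i, int pageCount) {
        return (i % 2 == 0) ? i + 1 : pageCount * 2 - i;
    }

    // 1-based page number of the left half of 0-based sheet i.
    static int leftPageNo(int i, int pageCount) {
        return (i % 2 == 0) ? pageCount * 2 - i : i + 1;
    }

    // Returns {left, right} halves of a scan. getSubimage returns views
    // that share pixel data with the original image, so nothing is copied.
    static BufferedImage[] splitHalves(BufferedImage scan) {
        int half = scan.getWidth() / 2;
        int height = scan.getHeight();
        return new BufferedImage[] {
            scan.getSubimage(0, 0, half, height),
            scan.getSubimage(half, 0, half, height)
        };
    }

    public static void main(String[] args) {
        int pageCount = 4; // number of scanned sheets, chosen for the demo
        BufferedImage scan = new BufferedImage(200, 100, BufferedImage.TYPE_INT_RGB);
        for (int i = 0; i < pageCount; i++) {
            BufferedImage[] halves = splitHalves(scan);
            System.out.println("sheet " + i
                    + ": left page " + leftPageNo(i, pageCount)
                    + ", right page " + rightPageNo(i, pageCount)
                    + ", half width " + halves[0].getWidth());
        }
    }
}
```

With four sheets, the formulas pair pages 8/1, 2/7, 6/3 and 4/5, so every page number from 1 to 8 is assigned exactly once.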
>
>
>
>
> On Wed, Sep 23, 2015 at 1:06 PM, Tilman Hausherr
> <THausherr@t-online.de> wrote:
>
>     On 23.09.2015 at 17:33, Tim Daley wrote:
>
>         It appears that PDFBOX 2 handles scanned documents differently
>         than PDFBOX 1.
>
>         I have multipage PDFs that I have scanned from a Konica/Minolta
>         C224e. The PDFs in version 1 seemed to come in as a single image.
>         Now in version 2, they seem to come in as multiple images. I
>         assume this is to reduce the size of the resultant PDFs.
>
>         Is there a way to retrieve each page as a single image or is
>         there a method to merge all the images on a page into a single
>         image?
>
>
>     Can't comment without having a sample PDF. And I don't know what
>     you mean by "seemed to come in as a single image".
>
>     Tilman
>
>     ---------------------------------------------------------------------
>     To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
>     For additional commands, e-mail: users-help@pdfbox.apache.org
>
>
>
>
> -- 
> *Tim Daley*
> IT Specialist-Operating Systems
> cru | Engagement & Services | Platform Team
> o:407-826-2911 | m:407-716-0284
> tim.daley@cru.org

