pdfbox-users mailing list archives

From Glenn Hirshon <Glenn.Hirs...@ozcap.com>
Subject RE: merging images from a compact-pdf file
Date Mon, 29 Nov 2010 17:21:10 GMT
Original post: I've been using PDFBox to process documents scanned with 
our office copy machines (Canon and Ricoh machines). Normally, the 
resulting PDF files contain one TIF image per page, and the 
page.convertToImage() function works fine to extract the image. One of the 
machines has a 'compact pdf' setting that produces a smaller file. When 
the compact feature is turned on, multiple images are stored in the page 
instead of a single TIF image, and they need to be reassembled into a 
single image through some type of merging process. 
I am able to extract the separate images, but I'm missing the roadmap on 
how to size the images and recombine them. Is there some sort of property 
that provides relative x,y coordinates so I can recombine them using a 
Graphics drawImage method? 

I've been trying all sorts of variations using some of the techniques in 
the PrintImageLocations sample. Sadly, I'm not seeing how the three 
separate images are laid out inside the crop box; they apparently have 
different scaling and x,y coordinates. I can extract the three images, 
get their sizes, and also get the size of the crop box. I've tried 
assembling the images back together by creating a new BufferedImage 
the size of the crop box and then using an AlphaComposite to layer 
each of the three images on top of the others.
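
The layering step described above can be sketched in plain Java2D. This is a minimal sketch, not the class attached to this message: LayerCompositor, Layer and compose are hypothetical names, and it assumes each image's destination rectangle has already been worked out in canvas pixels with a top-left origin. With AlphaComposite.SrcOver, a lower layer only shows through where the layer above it has transparent pixels, so fully opaque upper layers will simply cover what is beneath them.

```java
import java.awt.AlphaComposite;
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

final class LayerCompositor {

    /** One extracted image plus the pixel rectangle (top-left origin) it should occupy. */
    static final class Layer {
        final BufferedImage image;
        final int x, y, width, height;

        Layer(BufferedImage image, int x, int y, int width, int height) {
            this.image = image;
            this.x = x;
            this.y = y;
            this.width = width;
            this.height = height;
        }
    }

    /** Paints the layers in order (first = bottom) onto a white canvas. */
    static BufferedImage compose(int canvasWidth, int canvasHeight, Layer... layers) {
        BufferedImage canvas =
                new BufferedImage(canvasWidth, canvasHeight, BufferedImage.TYPE_INT_ARGB);
        Graphics2D g = canvas.createGraphics();
        try {
            g.setColor(Color.WHITE);
            g.fillRect(0, 0, canvasWidth, canvasHeight);
            g.setComposite(AlphaComposite.SrcOver);
            for (Layer layer : layers) {
                // drawImage scales the source to the requested width and height.
                g.drawImage(layer.image, layer.x, layer.y, layer.width, layer.height, null);
            }
        } finally {
            g.dispose();
        }
        return canvas;
    }
}
```

Passing the background image first and the smaller images after it keeps the painting order the same as the order in which the page draws them.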

Here is the pdf file I've been testing with and my java class, in case 
anyone has any ideas:

The idea would be to call the pdfImageProcess class, getImage method, and 
pass in a single PDF page as follows:

PDPage page = (PDPage)pages.get( i );
pdfImageProcess pdfIP = new pdfImageProcess();
image = pdfIP.getImage(page);

The debug output on the three images looks like:
Found image[Obj4] at 0.0,0.0 size=7803.0,13068.0, xScale=6.12, yScale=7.92
Found image[Obj5] at 36.48,465.6 size=11616.0,2457.9458, xScale=5.28, yScale=2.4288
Found image[Obj6] at 34.56,48.96 size=8995.431,8.639999, xScale=4.6464, yScale=0.144
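
Positions like the ones in this output are in PDF user space, where the origin is at the bottom-left of the page and one unit is 1/72 inch, while a BufferedImage has its origin at the top-left. The following is a minimal sketch of that conversion. PlacementMath and toPixelRect are hypothetical names, and it assumes the placed width and height of each image are already known in points (how they relate to the logged size and scale values is not established here).

```java
import java.awt.Rectangle;

final class PlacementMath {

    static final double POINTS_PER_INCH = 72.0;

    /**
     * Converts a placement given in PDF user space (points, bottom-left origin)
     * into a pixel rectangle with a top-left origin at the requested resolution.
     */
    static Rectangle toPixelRect(double xPt, double yPt,
                                 double widthPt, double heightPt,
                                 double pageHeightPt, double dpi) {
        double scale = dpi / POINTS_PER_INCH;
        // Flip the y axis: the top edge of the image, measured from the top of the page.
        double topPt = pageHeightPt - yPt - heightPt;
        int x = (int) Math.round(xPt * scale);
        int y = (int) Math.round(topPt * scale);
        int w = (int) Math.round(widthPt * scale);
        int h = (int) Math.round(heightPt * scale);
        return new Rectangle(x, y, w, h);
    }
}
```

As an illustration with assumed values, an image placed at 36.48,465.6 with a placed size of 528 x 242.88 points on a 792-point-high page maps, at 150 dpi, to a rectangle at x=76, y=174 that is 1100 pixels wide and 506 pixels high.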

Subject: RES: merging images from a compact-pdf file
From: José Rodolfo Carrijo de Freitas (jose...@softplan.com.br)
Date: Nov 23, 2010 10:21:20 am
List: org.apache.pdfbox.users



Hello Glenn, there is an example that shows how to do that. 
It is a class called PrintImageLocations. The problem is that it has to 
process the entire stream to find this information. Maybe you can adapt it 
to process the stream once and store those locations in a data structure. 
http://pdfbox.apache.org/apidocs/org/apache/pdfbox/examples/util/PrintImageLocations.html 
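
One way to read the "process the stream once and store those locations" suggestion is a small per-page index that is filled during a single pass and consulted later when drawing. This is only a sketch of such a data structure: ImageLocationIndex, Location, record and forPage are hypothetical names, not PDFBox classes, and the code that fills the index (for example an adapted PrintImageLocations) is not shown.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ImageLocationIndex {

    /** Where one image object is placed on its page, in page units. */
    static final class Location {
        final String objectName;
        final float x, y, width, height;

        Location(String objectName, float x, float y, float width, float height) {
            this.objectName = objectName;
            this.x = x;
            this.y = y;
            this.width = width;
            this.height = height;
        }
    }

    // Page index -> locations in the order they were recorded (drawing order).
    private final Map<Integer, List<Location>> byPage = new LinkedHashMap<>();

    /** Called once per image while the page content stream is being processed. */
    void record(int pageIndex, Location location) {
        byPage.computeIfAbsent(pageIndex, k -> new ArrayList<>()).add(location);
    }

    /** Returns the recorded locations for a page, or an empty list if none were found. */
    List<Location> forPage(int pageIndex) {
        List<Location> found = byPage.get(pageIndex);
        return found == null
                ? Collections.<Location>emptyList()
                : Collections.unmodifiableList(found);
    }
}
```

Keeping the recorded order matters here because later images are painted over earlier ones when the page is reassembled.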
-----Original message----- From: GlennHirshon [mailto:Glen...@ozcap.com] 
Sent: Tuesday, November 23, 2010 16:16 To: users Subject: 
merging images from a compact-pdf file 
I've been using PDFBox to process documents scanned with our office copy 
machines (Canon and Ricoh machines). Normally, the resulting PDF files 
contain one TIF image per page, and the page.convertToImage() function 
works fine to extract the image. One of the machines has a 'compact pdf' 
setting that produces a smaller file. When the compact feature is turned 
on, multiple images are stored in the page instead of a single TIF image, 
and they need to be reassembled into a single image through some type of 
merging process. 
I am able to extract the separate images, but I'm missing the roadmap on 
how to size the images and recombine them. Is there some sort of property 
that provides relative x,y coordinates so I can recombine them using a 
Graphics drawImage method? 




