pdfbox-users mailing list archives

From <spr...@gmx.eu>
Subject RE: convertToImage -> OutOfMemoryError
Date Fri, 07 Jan 2011 16:41:48 GMT
> >> PDDocument doc = PDDocument.load(new File("big.pdf"));
> >> PDDocumentCatalog catalog = doc.getDocumentCatalog();
> >> List pages = catalog.getAllPages();
> >> int n = 0;
> >> for (Object o : pages) {
> >>    PDPage page = (PDPage) o;
> >>    BufferedImage image = page.convertToImage(); // <= OOME
> >>    // one output file per page
> >>    ImageIO.write(image, "png", new File("page-" + (n++) + ".png"));
> >> }
> >> doc.close();
> >
> > OK, I got it. When the page is too big (e.g. really large images), an
> > OOME is thrown.
> > Is there any possibility to find out the approximate size of the page
> > without converting it into an image and catching the OOME?

> The final image size depends on the page size and the given resolution.

The problem is that the width and height always seem to be A4 (about 600x900
pixels) for every page in the document (via getWidth(), getHeight()).
The resolution is always 300 dpi.
In reality, though, some images are 7000x10000 pixels.
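
To make the numbers above concrete, here is a rough sketch of the arithmetic
only. It does not call PDFBox. The class and method names are made up for
illustration, and it assumes 72 points per inch and 4 bytes per pixel (an
int-backed RGB/ARGB BufferedImage); other image types would need a different
factor.

```java
public class RasterSizeEstimate {
    // Assumption: default PDF user space, 72 points per inch.
    static final double POINTS_PER_INCH = 72.0;

    // Pixels needed to render a length given in points at the given dpi.
    static long pixels(double points, int dpi) {
        return (long) Math.ceil(points / POINTS_PER_INCH * dpi);
    }

    // Rough size of the raster data only; ignores any JVM/object overhead.
    static long estimatedBytes(long widthPx, long heightPx, int bytesPerPixel) {
        return widthPx * heightPx * bytesPerPixel;
    }

    public static void main(String[] args) {
        int dpi = 300;
        int bytesPerPixel = 4; // assumption: 4 bytes per pixel

        // An A4-sized page (about 595 x 842 points) rendered at 300 dpi.
        long w = pixels(595, dpi);
        long h = pixels(842, dpi);
        long pageBytes = estimatedBytes(w, h, bytesPerPixel);

        // A single 7000 x 10000 pixel image, as mentioned above.
        long imageBytes = estimatedBytes(7000, 10000, bytesPerPixel);

        long maxHeap = Runtime.getRuntime().maxMemory();

        System.out.println("A4 page at " + dpi + " dpi: " + w + " x " + h
                + " px, ~" + pageBytes / (1024 * 1024) + " MiB");
        System.out.println("7000 x 10000 image: ~"
                + imageBytes / (1024 * 1024) + " MiB");
        System.out.println("Max heap: " + maxHeap / (1024 * 1024) + " MiB");
    }
}
```

Comparing such an estimate with Runtime.getRuntime().maxMemory() before
rendering would only be a heuristic, since it says nothing about how much
memory the decoding of embedded images needs along the way.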

> I guess the original size of the TIFFs is probably the show stopper.

Maybe, but how can I determine this size?

> If you are just interested in the images, try to extract them using 
> ExtractImages, see [1] for further information.

No, not only images, text too.

