pdfbox-users mailing list archives

From VIGNESH S <vigneshkln...@gmail.com>
Subject Getting Out of Memory Error when trying to parse and extract text of 8 MB PDF Document
Date Mon, 28 Jan 2013 14:45:42 GMT

I tried extracting text from an 8 MB PDF document. It used more than
64 MB of heap and failed with an out-of-memory error when tested on Android phones.

My understanding is that PDFBox loads all objects into an object pool
up front, so heap usage grows with the number of objects in the
PDF document. This looks like a DOM-style way of doing things.

Is there an alternative, memory-efficient, SAX-style way of extracting text in PDFBox?

Thanks and regards,
Vignesh Srinivasan