pdfbox-users mailing list archives

From VIGNESH S <vigneshkln...@gmail.com>
Subject Re: Getting Out of Memory Error when trying to parse and extract text of 8 MB PDF Document
Date Sat, 02 Feb 2013 06:09:29 GMT
Hi Andreas,

Do you have any suggestions?

On Thu, Jan 31, 2013 at 6:52 PM, Andreas Lehmkühler <andreas@lehmi.de> wrote:
> Hi,
>
> On 28.01.13 15:45, VIGNESH S wrote:
>
>> Hi,
>>
>> I tried extracting text from an 8 MB PDF document. It used more than
>> 64 MB of heap and ran out of memory when tested on Android phones.
>>
>> As I understand it, PDFBox initially loads all objects into an object
>> pool, so heap usage grows with the number of objects in the PDF
>> document. This looks like a DOM-style approach.
>>
>> Is there a memory-efficient, SAX-style alternative for extracting text
>> with PDFBox?
>
> Try the new nonsequential parser using loadNonSeq() instead of load().
>
> BR
> Andreas Lehmkühler
>
>
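For reference, here is a minimal sketch of what the suggestion above might look like. It assumes the PDFBox 1.8.x API (PDDocument.loadNonSeq(File, RandomAccess) and org.apache.pdfbox.util.PDFTextStripper); the class name NonSeqTextExtractor and the method name extract are made up for illustration. Writing the text to a Writer rather than building one large String is an extra assumption on my part about where heap can be saved, not something stated in the thread.

```java
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.Writer;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.util.PDFTextStripper;

// Hypothetical helper class, not part of PDFBox.
public class NonSeqTextExtractor {

    // Extracts the text of 'pdf' into the file 'out' using the
    // non-sequential parser (assumes PDFBox 1.8.x).
    public static void extract(File pdf, File out) throws IOException {
        PDDocument doc = null;
        try {
            // Second argument is an optional scratch file; null means
            // PDFBox keeps its temporary data in memory.
            doc = PDDocument.loadNonSeq(pdf, null);

            Writer writer = new BufferedWriter(new FileWriter(out));
            try {
                PDFTextStripper stripper = new PDFTextStripper();
                // Stream the text to the writer instead of collecting
                // it in a single String via getText().
                stripper.writeText(doc, writer);
            } finally {
                writer.close();
            }
        } finally {
            if (doc != null) {
                doc.close();
            }
        }
    }
}
```

It could be called as NonSeqTextExtractor.extract(new File("input.pdf"), new File("output.txt")). Whether this is enough to stay under a 64 MB heap on a given device would still need to be measured.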



-- 
Thanks and Regards
Vignesh Srinivasan
9739135640
