pdfbox-users mailing list archives

From Maruan Sahyoun <sahy...@fileaffairs.de>
Subject Re: Getting Out of Memory Error when trying to parse and extract text of 8 MB PDF Document
Date Wed, 06 Feb 2013 09:51:15 GMT
Could you share a code snippet, and ideally a sample PDF, so we can verify how you are
doing the extraction and check the PDF itself? Which memory settings are you using?
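
In case it helps with the memory settings question, here is a minimal sketch for reporting the heap limits the JVM is actually running with. It uses only the standard `java.lang.Runtime` API; the class name `HeapSettings` and the helpers `toMiB` and `describe` are made up for this illustration and are not part of PDFBox.

```java
public class HeapSettings {

    /** Converts a byte count to whole mebibytes (hypothetical helper). */
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    /** Builds a one-line summary of the heap figures reported by the running JVM. */
    static String describe() {
        Runtime rt = Runtime.getRuntime();
        // maxMemory(): upper bound the JVM will try to use (driven by -Xmx, or the default if unset)
        // totalMemory(): heap currently reserved by the JVM
        // freeMemory(): unused part of the currently reserved heap
        return "max=" + toMiB(rt.maxMemory()) + " MiB, "
                + "total=" + toMiB(rt.totalMemory()) + " MiB, "
                + "free=" + toMiB(rt.freeMemory()) + " MiB";
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running it with the same options as the extraction job (for example `java -Xmx512m HeapSettings` after compiling) should show whether the heap limit you expect is really in effect.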

Kind regards

Maruan Sahyoun

On 06.02.2013 at 10:28, VIGNESH S <vigneshklncit@gmail.com> wrote:

> Yes, I tried it about 3 months ago and had the same problem.
> 
> I have not tried it with the recent release.
> 
> On Wed, Feb 6, 2013 at 12:08 AM, Andreas Lehmkuehler <andreas@lehmi.de> wrote:
>> Hi,
>> 
>> On 05.02.2013 at 15:20, VIGNESH S wrote:
>> 
>>> I think the non-sequential PDF parser also loads every object into the object pool.
>>> 
>>> The difference, I think, is that the non-sequential parser reads the xref table
>>> in the trailer to learn the PDF structure instead of linearly traversing
>>> the document.
>> 
>> Yes, it works differently from the old one (it follows the spec).
>> Did you try it?
>> 
>> 
>>> Correct me if I am wrong.
>> 
>> 
>> BR
>> Andreas Lehmkühler
> 
> 
> 
> -- 
> Thanks and Regards
> Vignesh Srinivasan
> 9739135640

