pdfbox-dev mailing list archives

From "Maruan Sahyoun (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (PDFBOX-4569) Implement an ondemand Parser
Date Mon, 17 Jun 2019 06:26:00 GMT

    [ https://issues.apache.org/jira/browse/PDFBOX-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16865356#comment-16865356 ]

Maruan Sahyoun edited comment on PDFBOX-4569 at 6/17/19 6:25 AM:
-----------------------------------------------------------------

Looking at the current code, this is a huge step forward for a lot of common usage patterns.

Given the comments above, though, we might be missing a typical use case: limiting memory
consumption even when going through the complete PDF.

I think that could be implemented with the on-demand parser in two ways:
- either by a kind of drop call that frees parsed objects
- or by freeing them after they have been referenced, leaving it to a higher level to cache
them
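
To make the two options concrete, here is a rough Java sketch. It is only an illustration
under stated assumptions: DroppablePool, CallerSideCache and the parseObject function are
hypothetical names and not PDFBox API, and parseObject merely stands in for "parse object N
from the file".

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.LongFunction;

// Hypothetical sketch: none of these types are PDFBox API.

/** Option 1: the parser keeps parsed objects until the caller drops them. */
class DroppablePool<T> {
    private final LongFunction<T> parseObject; // stand-in for "parse object N from the file"
    private final Map<Long, T> parsed = new HashMap<>();
    private int parseCalls = 0;

    DroppablePool(LongFunction<T> parseObject) {
        this.parseObject = parseObject;
    }

    /** Parses on first access, afterwards serves the retained instance. */
    T get(long objectNumber) {
        T value = parsed.get(objectNumber);
        if (value == null) {
            value = parseObject.apply(objectNumber);
            parseCalls++;
            parsed.put(objectNumber, value);
        }
        return value;
    }

    /** Releases the parsed object; a later get() parses it again. */
    void drop(long objectNumber) {
        parsed.remove(objectNumber);
    }

    int retainedCount() { return parsed.size(); }
    int parseCalls() { return parseCalls; }
}

/** Option 2: the parser retains nothing; a higher level decides what to cache. */
class CallerSideCache<T> {
    private final LongFunction<T> parseObject;
    private final Map<Long, T> cache;
    private int parseCalls = 0;

    CallerSideCache(LongFunction<T> parseObject, int maxEntries) {
        this.parseObject = parseObject;
        // Access-ordered LinkedHashMap that evicts the least recently used entry.
        this.cache = new LinkedHashMap<Long, T>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Long, T> eldest) {
                return size() > maxEntries;
            }
        };
    }

    /** Serves from the bounded cache, parsing again after an eviction. */
    T get(long objectNumber) {
        T value = cache.get(objectNumber);
        if (value == null) {
            value = parseObject.apply(objectNumber);
            parseCalls++;
            cache.put(objectNumber, value);
        }
        return value;
    }

    int cachedCount() { return cache.size(); }
    int parseCalls() { return parseCalls; }
}
```

In both variants the trade-off is the same: memory stays bounded while walking the complete
PDF, at the price of parsing an object again if it is requested after it was freed. The LRU
policy in the second class is just one possible choice for the higher-level cache.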

Thoughts?

I'm also happy if the case outlined is not covered, but I think it's important that we communicate
that clearly, i.e. where the benefits are and which cases might not benefit from the on-demand
approach.



> Implement an ondemand Parser
> ----------------------------
>
>                 Key: PDFBOX-4569
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4569
>             Project: PDFBox
>          Issue Type: Improvement
>          Components: Parsing
>    Affects Versions: 3.0.0 PDFBox
>            Reporter: Andreas Lehmkühler
>            Assignee: Andreas Lehmkühler
>            Priority: Major
>             Fix For: 3.0.0 PDFBox
>
>         Attachments: PDFBOX-1084.pdf
>
>
> There is a need to replace the big bang parser with an on-demand parser.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
