james-mime4j-dev mailing list archives

From "Stefano Bagnara (JIRA)" <mime4j-...@james.apache.org>
Subject [jira] [Commented] (MIME4J-116) Avoid duplicate parsing of header fields
Date Tue, 21 Jun 2011 11:31:47 GMT

    [ https://issues.apache.org/jira/browse/MIME4J-116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13052489#comment-13052489 ]

Stefano Bagnara commented on MIME4J-116:

I was not thinking about a stateful MutableBodyDescriptor. Instead, I was thinking about allowing
MutableBodyDescriptor.addField to return a replacement Field for the RawField it receives as
input. If the return value is non-null, the caller can use it instead of the original.
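
To make the idea concrete, here is a minimal sketch. All types below are simplified, hypothetical
stand-ins and not the actual Mime4j interfaces; the Content-Type parsing is deliberately naive
(it ignores comments and quoting) and only shows where the replacement Field would come from.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical, simplified stand-ins: these are NOT the real Mime4j types.
class ReplacementFieldSketch {

    interface Field {
        String getName();
        String getBody();
    }

    /** Unparsed field as delivered by the low-level parser. */
    static final class RawField implements Field {
        private final String name;
        private final String body;

        RawField(String name, String body) {
            this.name = name;
            this.body = body;
        }

        public String getName() { return name; }
        public String getBody() { return body; }
    }

    /** Parsed field produced by the descriptor (assumed shape). */
    static final class ContentTypeField implements Field {
        private final String body;
        private final String mimeType;

        ContentTypeField(String body) {
            this.body = body;
            // Naive parse: everything before the first ';' is the media type.
            int semi = body.indexOf(';');
            String type = semi < 0 ? body : body.substring(0, semi);
            this.mimeType = type.trim().toLowerCase(Locale.ROOT);
        }

        public String getName() { return "Content-Type"; }
        public String getBody() { return body; }
        public String getMimeType() { return mimeType; }
    }

    interface MutableBodyDescriptor {
        /** Returns a parsed replacement, or null if the raw field was not parsed here. */
        Field addField(RawField field);
    }

    static final class SketchBodyDescriptor implements MutableBodyDescriptor {
        private String mimeType = "text/plain";

        public Field addField(RawField field) {
            if ("Content-Type".equalsIgnoreCase(field.getName())) {
                ContentTypeField parsed = new ContentTypeField(field.getBody());
                mimeType = parsed.getMimeType();
                return parsed; // the caller can reuse this instead of parsing again
            }
            return null; // not a field this descriptor cares about
        }

        public String getMimeType() { return mimeType; }
    }

    /** Caller side: prefer the replacement when the descriptor supplies one. */
    static List<Field> collect(MutableBodyDescriptor descriptor, List<RawField> rawFields) {
        List<Field> result = new ArrayList<>();
        for (RawField raw : rawFields) {
            Field replacement = descriptor.addField(raw);
            result.add(replacement != null ? replacement : raw);
        }
        return result;
    }

    public static void main(String[] args) {
        SketchBodyDescriptor descriptor = new SketchBodyDescriptor();
        List<RawField> raw = new ArrayList<>();
        raw.add(new RawField("Subject", "hello"));
        raw.add(new RawField("Content-Type", "Text/HTML; charset=utf-8"));
        for (Field f : collect(descriptor, raw)) {
            System.out.println(f.getName() + " -> " + f.getClass().getSimpleName());
        }
    }
}
```

With this shape the descriptor stays stateless with respect to the caller: each field is parsed at
most once, and the caller never needs to ask whether a field has already been parsed.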

The fact is that the MutableBodyDescriptors already contain parsing logic (in fact, field parsing
is their main job right now), and they are the place where further parsing should happen. Maybe
the problem is the name of this "interface", but I think it would not be a hack (and certainly
not an ugly one) to have that object deal with all the parsing logic.

MutableBodyDescriptor could already be a user of the parsed fields, but if the parsing logic
is *optional* and happens *earlier*, it will still have to check whether the field has been
parsed, or parse it again. Also, if we move the parsing into the BodyDescriptor, we can parse
only the fields we care about during parsing, rather than every field defined in the
FieldParser (yes, we could replace them with lazily parsed fields, but this way we could avoid
even that step).
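
For comparison, the lazy alternative mentioned above could look roughly like this. The class name
and its shape are assumptions made up for illustration, not existing Mime4j code, and the sketch
is not thread-safe.

```java
import java.util.function.Function;

// Hypothetical sketch of a lazily parsed field; not part of Mime4j.
class LazyFieldSketch {

    static final class LazyParsedField<T> {
        private final String rawBody;
        private final Function<String, T> parser;
        private T parsed;
        private boolean done;
        private int parseCount; // only here to make the behaviour observable

        LazyParsedField(String rawBody, Function<String, T> parser) {
            this.rawBody = rawBody;
            this.parser = parser;
        }

        /** Parses on first access only; later calls return the cached result. */
        public T get() {
            if (!done) {
                parsed = parser.apply(rawBody);
                done = true;
                parseCount++;
            }
            return parsed;
        }

        public int getParseCount() { return parseCount; }
    }

    public static void main(String[] args) {
        LazyParsedField<Integer> length =
                new LazyParsedField<>(" 42 ", s -> Integer.parseInt(s.trim()));
        System.out.println(length.getParseCount()); // nothing parsed yet
        System.out.println(length.get());
        System.out.println(length.get());
        System.out.println(length.getParseCount()); // parsed a single time
    }
}
```

Lazy fields avoid parsing what nobody reads, but every field still gets wrapped. Letting the
descriptor decide which fields to parse skips the wrapper for fields it does not care about.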

WDYT? I'm still playing with the code, so maybe I missed something important, but if you
don't think this is an option, maybe I should stop playing with this and look elsewhere.

> Avoid duplicate parsing of header fields
> ----------------------------------------
>                 Key: MIME4J-116
>                 URL: https://issues.apache.org/jira/browse/MIME4J-116
>             Project: JAMES Mime4j
>          Issue Type: Improvement
>    Affects Versions: 0.6
>            Reporter: Markus Wiederkehr
>             Fix For: 0.7
> Currently some header fields are parsed twice when building a DOM. Once by DefaultBodyDescriptor
> or MaximalBodyDescriptor and a second time by MessageBuilder using Field.parse().
> Also different parsers are used in both stages. The body descriptors use handcrafted
> parsers whereas Field.parse uses JavaCC generated parsers. The handcrafted version does not
> seem to handle comments in a header correctly.
> The situation should be improved by parsing a header field only once and passing that
> already parsed field to a content handler. Also only one sort of field parser should be used;
> either handcrafted or generated. My personal opinion is that it might be easier for a handcrafted
> parser to be more tolerant against malformed header fields.

This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

