james-mime4j-dev mailing list archives

From "Stefano Bagnara (JIRA)" <mime4j-...@james.apache.org>
Subject [jira] Commented: (MIME4J-58) Lenient dealing with headless messages or malformed header/body separation
Date Fri, 01 Jan 2010 16:05:54 GMT

    [ https://issues.apache.org/jira/browse/MIME4J-58?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12795761#action_12795761 ]

Stefano Bagnara commented on MIME4J-58:
---------------------------------------

You're right. I'll add an exception if unread is called before a temp buffer has been completed.
Calling "unread" multiple times should not be allowed at all by the contract: unread exists
only to push back the *last* line read, nothing more.
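
The contract described above could be sketched roughly as follows. This is only an
illustration, not mime4j's actual code: the class name SingleLinePushback, its methods and
the use of an Iterator<String> as the line source are all assumptions made for the example.

```java
import java.util.Iterator;

// Hypothetical sketch (not mime4j API): a line source that can push back
// only the most recently read line, and only once per read.
class SingleLinePushback {
    private final Iterator<String> source;
    private String lastRead;    // line returned by the latest readLine()
    private String pushedBack;  // line waiting to be delivered again, or null
    private boolean canUnread;  // true only right after a successful read

    SingleLinePushback(Iterator<String> source) {
        this.source = source;
    }

    // Returns the next line, or null when the input is exhausted.
    String readLine() {
        if (pushedBack != null) {
            lastRead = pushedBack;
            pushedBack = null;
        } else {
            lastRead = source.hasNext() ? source.next() : null;
        }
        canUnread = lastRead != null;
        return lastRead;
    }

    // Pushes back the last line read; a second call without an
    // intervening readLine() violates the contract and throws.
    void unread() {
        if (!canUnread) {
            throw new IllegalStateException("unread() needs a preceding successful readLine()");
        }
        pushedBack = lastRead;
        canUnread = false;
    }
}
```

Throwing IllegalStateException is one possible way to enforce "only the last line, only
once"; the exception type actually used in mime4j is not stated in this thread.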

> Lenient dealing with headless messages or malformed header/body separation
> --------------------------------------------------------------------------
>
>                 Key: MIME4J-58
>                 URL: https://issues.apache.org/jira/browse/MIME4J-58
>             Project: JAMES Mime4j
>          Issue Type: Task
>    Affects Versions: 0.3
>            Reporter: Stefano Bagnara
>            Assignee: Stefano Bagnara
>             Fix For: 0.8
>
>         Attachments: headerbody-nocrlfcrlf.msg, headerbody-noheader.msg
>
>
> Define how to deal with non-canonical messages like this one:
> -----------------------
> This is a simple message not having headers.
> The whole text should be recognized as body.
> -----------------------
> or this one:
> -----------------------
> Subject: this is a subject
> This is an invalid header
> AnotherHeader: is this an header or the first part of the body?
> Body text
> -----------------------
> In the first case mime4j outputs an "invalid header" error twice, and a roundtrip write
> results in an empty message.
> In the SMTP case this is unfortunate because messages are sometimes sent without headers.
> In the second case mime4j currently takes Subject and AnotherHeader as headers, "This is an
> invalid header" raises a monitor for "invalid header", and "Body text" is considered the
> body.
> A compromise we evaluated in the past between compliance, leniency and performance was to
> "alter" the requirement for CRLFCRLF between headers and body with a different rule: if, while
> parsing the headers, we find a line that is not a continuation (multiline) line and does not
> have the form "HeaderName: something", then we virtually add a CRLF *before* that line and
> consider it the first line of the body. This allows us to buffer only a single line (as
> opposed to scanning the whole message in search of a CRLFCRLF and considering the full
> message a body if none is found) and to be very lenient with input. The "side effect" (maybe
> not bad) is that a malformed header in the middle of the headers will move that line and
> every header after it into the body.
> With this algorithm the above would be "virtually" parsed as if it were:
> -----------------------
>
> This is a simple message not having headers.
> The whole text should be recognized as body.
> -----------------------
> or this one:
> -----------------------
> Subject: this is a subject
>
> This is an invalid header
> AnotherHeader: is this an header or the first part of the body?
> Body text
> -----------------------
> If we think about strict and lenient approaches, I think the current mime4j result is
> OK when using strict parsing, while the one I propose is a good lenient alternative.
> Opinions? Alternatives?
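
For illustration, the lenient rule proposed in the description could be sketched as below.
This is a minimal sketch under stated assumptions, not the mime4j parser: the class
LenientSplit, its split method and the FIELD pattern are hypothetical names, input is
assumed to be already split into lines, and a "header" is taken to be a run of printable
ASCII characters other than ':' followed by a colon.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch (not mime4j API) of the "virtual CRLF" rule.
class LenientSplit {
    // Assumed header shape: printable ASCII name without ':', then a colon.
    private static final Pattern FIELD =
            Pattern.compile("[\\x21-\\x39\\x3B-\\x7E]+:.*");

    final List<String> headers = new ArrayList<>();
    final List<String> body = new ArrayList<>();

    static LenientSplit split(List<String> lines) {
        LenientSplit result = new LenientSplit();
        int i = 0;
        for (; i < lines.size(); i++) {
            String line = lines.get(i);
            if (line.isEmpty()) {
                i++;            // regular separator: consume it, body starts after
                break;
            }
            boolean continuation = !result.headers.isEmpty()
                    && (line.charAt(0) == ' ' || line.charAt(0) == '\t');
            if (continuation) {
                // Folded (multiline) header: append to the previous header.
                int last = result.headers.size() - 1;
                result.headers.set(last, result.headers.get(last) + "\r\n" + line);
            } else if (FIELD.matcher(line).matches()) {
                result.headers.add(line);
            } else {
                // Virtual CRLF: the line is not consumed, it opens the body.
                break;
            }
        }
        result.body.addAll(lines.subList(i, lines.size()));
        return result;
    }
}
```

Applied to the two sample messages, this sketch yields no headers and a two-line body for
the first, and a single Subject header with a three-line body (starting at "This is an
invalid header") for the second. Only the line currently being examined has to be held,
which is the single-line buffering the description refers to.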

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

