james-mime4j-dev mailing list archives

From "Stefano Bagnara (JIRA)" <mime4j-...@james.apache.org>
Subject [jira] Resolved: (MIME4J-58) Lenient dealing with headless messages or malformed header/body separation
Date Sat, 30 Jan 2010 19:20:34 GMT

     [ https://issues.apache.org/jira/browse/MIME4J-58?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stefano Bagnara resolved MIME4J-58.
-----------------------------------

    Resolution: Fixed

Merged from cycleclean branch.

> Lenient dealing with headless messages or malformed header/body separation
> --------------------------------------------------------------------------
>
>                 Key: MIME4J-58
>                 URL: https://issues.apache.org/jira/browse/MIME4J-58
>             Project: JAMES Mime4j
>          Issue Type: Task
>    Affects Versions: 0.3
>            Reporter: Stefano Bagnara
>            Assignee: Stefano Bagnara
>             Fix For: 0.7
>
>         Attachments: headerbody-nocrlfcrlf.msg, headerbody-noheader.msg
>
>
> Define how to deal with non-canonical messages like this one:
> -----------------------
> This is a simple message not having headers.
> The whole text should be recognized as body.
> -----------------------
> or this one:
> -----------------------
> Subject: this is a subject
> This is an invalid header
> AnotherHeader: is this an header or the first part of the body?
> Body text
> -----------------------
> In the first case mime4j outputs an "invalid header" error twice, and a round-trip write
> results in an empty message.
> In the SMTP case this is unfortunate, because messages are sometimes sent without headers.
> In the second case mime4j currently takes Subject and AnotherHeader as headers, "This is
> an invalid header" raises a monitor event for "invalid header", and "Body text" is
> considered the body.
> A compromise we evaluated in the past between compliance, leniency and performance was to
> "alter" the requirement for CRLFCRLF between headers and body with a different rule: if,
> while parsing the headers, we find a line that is not a continuation (folded) line and
> does not have the form "HeaderName: something", then we virtually add a CRLF *before*
> that line and consider it the first line of the body. This allows us to buffer only a
> single line (as opposed to scanning the whole message in search of a CRLFCRLF and
> treating the full message as body if none is found) and to be very lenient with input.
> The "side effect" (maybe not bad) is that an invalid header in the middle of the headers
> will result in the headers after it being moved to the body.
> With this algorithm the above would be "virtually" parsed as if it were:
> -----------------------
>
> This is a simple message not having headers.
> The whole text should be recognized as body.
> -----------------------
> or this one:
> -----------------------
> Subject: this is a subject
>
> This is an invalid header
> AnotherHeader: is this an header or the first part of the body?
> Body text
> -----------------------
> If we think about strict and lenient approaches, I think that the current mime4j result
> is ok when using strict parsing, while the one I propose is a good lenient alternative.
> Opinions? Alternatives?
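
The lenient rule proposed in the description can be sketched roughly as follows. This is
an illustration only, not mime4j code: the function name split_lenient and the
HEADER_FIELD pattern are hypothetical, and the pattern assumes that "HeaderName:
something" means a run of printable ASCII characters (no spaces, no colon) followed by a
colon.

```python
import re

# Hypothetical pattern for "HeaderName: something": one or more printable
# ASCII characters other than space and colon, followed by a colon.
HEADER_FIELD = re.compile(r"^[\x21-\x39\x3b-\x7e]+:")


def split_lenient(raw):
    """Split raw message text into (list of header fields, body text).

    The first line that is neither a header field nor a folded continuation
    is treated as if a blank line preceded it, so it starts the body.
    """
    lines = raw.splitlines(keepends=True)
    headers = []
    i = 0
    while i < len(lines):
        line = lines[i]
        if line.strip("\r\n") == "":
            # Regular blank separator: the body starts on the next line.
            i += 1
            break
        if line[0] in " \t" and headers:
            # Folded continuation of the previous header field.
            headers[-1] += line
        elif HEADER_FIELD.match(line):
            headers.append(line)
        else:
            # Not a header: act as if a CRLF had been inserted before it.
            break
        i += 1
    return headers, "".join(lines[i:])
```

Only the current line is inspected at each step, which matches the single-line buffering
argument made in the description. Under these assumptions the headerless example comes
back with no headers and the whole text as body, while the second example keeps only
"Subject" as a header and moves everything from "This is an invalid header" onward,
including "AnotherHeader", into the body.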

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

