james-mime4j-dev mailing list archives

From "Markus Wiederkehr (JIRA)" <mime4j-...@james.apache.org>
Subject [jira] Commented: (MIME4J-112) Define Limits Of Round Tripping In Mime4J
Date Sat, 07 Feb 2009 11:52:59 GMT

    [ https://issues.apache.org/jira/browse/MIME4J-112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12671454#action_12671454 ]

Markus Wiederkehr commented on MIME4J-112:

> 1. Preservation of comment data after parsing fields

This should not be a problem since every Field stores the original raw field string. The raw
field string is used when writing the message. The only information lost is the kind of line
delimiter that follows the field but this could easily be preserved, too.
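
To illustrate the point, here is a minimal sketch of a field that keeps the raw string next to its parsed view. The class and member names (RawFieldSketch, SketchField, unfoldedBody) are made up for this example and are not Mime4j's actual API; the sketch only assumes that writing emits the stored raw text unchanged.

```java
public class RawFieldSketch {

    // Hypothetical field type, not Mime4j's own Field class: it keeps the
    // parsed view and the untouched original text side by side.
    public static final class SketchField {
        public final String name;
        public final String unfoldedBody;
        public final String raw;

        public SketchField(String raw) {
            this.raw = raw;
            int colon = raw.indexOf(':');
            this.name = raw.substring(0, colon);
            // Unfolding drops every CRLF that is followed by a space or tab,
            // so the original wrapping cannot be recovered from this value.
            this.unfoldedBody =
                    raw.substring(colon + 1).replaceAll("\r\n(?=[ \t])", "").trim();
        }

        // Writing emits the stored raw text instead of re-serializing the
        // parsed parts, so comments and wrapping survive as they were.
        public String write() {
            return raw;
        }
    }

    public static void main(String[] args) {
        String raw = "Subject: an early\r\n wrapped (comment) value";
        SketchField field = new SketchField(raw);
        System.out.println(field.unfoldedBody);
        System.out.println(field.write().equals(raw));
    }
}
```

The parsed value loses the fold, but write() still returns exactly the characters that were read.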

> Another difficulty for unlimited round tripping (without preserving the original bits)
is how to record the header wrapping for unconventional wrapping schemes. For example, a message
may choose to wrap header values early but this information is lost during parsing.

It is not - see above.

> 2. Preservation of information about character encoding in headers

The field string is built by AbstractEntity using ByteArrayBuffer and CharArrayBuffer. The
CharArrayBuffer uses the following code for converting an input byte into a character: 

            int ch = b[i1]; 
            if (ch < 0) {
                ch = 256 + ch;
            }

It might not be obvious, but this is an ISO-8859-1 conversion (because Unicode code points
U+0000 to U+00FF correspond directly to ISO-8859-1 byte codes 00 to FF).

So we would only have to use Latin 1 (ISO-8859-1) when writing the header fields.
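
A small self-contained check of this claim, as a sketch: the helper bytesToString below copies the widening logic from the snippet above into a made-up class (Latin1RoundTrip is not part of Mime4j), and the check compares it against the JDK's ISO-8859-1 charset for all 256 byte values.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Latin1RoundTrip {

    // Widen each byte to a char the same way the quoted snippet does:
    // negative (signed) byte values are shifted into the range 128..255.
    static String bytesToString(byte[] raw) {
        char[] chars = new char[raw.length];
        for (int i = 0; i < raw.length; i++) {
            int ch = raw[i];
            if (ch < 0) {
                ch = 256 + ch;
            }
            chars[i] = (char) ch;
        }
        return new String(chars);
    }

    // True if the widened string equals the JDK's ISO-8859-1 decoding and
    // encoding it as ISO-8859-1 again restores the original bytes.
    static boolean roundTrips(byte[] raw) {
        String field = bytesToString(raw);
        boolean sameAsLatin1 =
                field.equals(new String(raw, StandardCharsets.ISO_8859_1));
        byte[] written = field.getBytes(StandardCharsets.ISO_8859_1);
        return sameAsLatin1 && Arrays.equals(raw, written);
    }

    public static void main(String[] args) {
        byte[] all = new byte[256];
        for (int i = 0; i < all.length; i++) {
            all[i] = (byte) i;
        }
        System.out.println(roundTrips(all));
    }
}
```

Because every byte maps to exactly one char and back, header bytes that are not valid ASCII would also survive a read/write cycle unchanged.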

> 3. Ability to build mail which does comply with the specifications

Unclear to me; what specification are you referring to and how is this related to round tripping?

> My feeling is that - given the availability of standard meta-data+document representations
- Mime4J should support only limited round tripping for mail building representations.

I don't agree because I think that perfect round tripping might be a prerequisite for S/MIME
canonicalization (MIME4J-113). Canonicalization is useless if bits of the original content
have already been lost.

From my point of view, Mime4j also has to preserve the original transfer encodings.
Quoted-printable (and even base64) data cannot be relied on to re-encode to the same bytes,
because the encoder is free to choose line lengths and, for quoted-printable, which characters
to escape. This might become nasty with inner encodings: for example, a message might contain
another message that is transfer encoded in its entirety. Mime4j would have to parse that
inner message only on demand.

Preserving the original transfer encodings clearly causes some overhead and should be optional.
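
The base64 case can be shown with the JDK alone. This is only a sketch and does not use Mime4j's codecs: it assumes a sender that wraps base64 lines at 40 characters (an arbitrary choice for the example), and the class and method names are made up.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

public class TransferEncodingRoundTrip {

    private static final byte[] CRLF = "\r\n".getBytes(StandardCharsets.US_ASCII);

    // Encode the way the assumed sender did: base64 lines of 40 characters.
    static String encodeLikeSender(byte[] payload) {
        return Base64.getMimeEncoder(40, CRLF).encodeToString(payload);
    }

    // Decode the body and encode it again with the JDK's default MIME
    // encoder, which wraps lines at 76 characters.
    static String reencode(String encodedBody) {
        byte[] decoded = Base64.getMimeDecoder().decode(encodedBody);
        return Base64.getMimeEncoder().encodeToString(decoded);
    }

    public static void main(String[] args) {
        byte[] payload = new byte[60];
        Arrays.fill(payload, (byte) 'x');

        String original = encodeLikeSender(payload);
        String rewritten = reencode(original);

        // The decoded content is identical, the encoded text is not.
        System.out.println(Arrays.equals(
                Base64.getMimeDecoder().decode(original),
                Base64.getMimeDecoder().decode(rewritten)));
        System.out.println(original.equals(rewritten));
    }
}
```

The content survives, but the encoded body differs in where its lines break, so a bitwise identical copy would require keeping the original encoded bytes.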

I think there is not much else to it. The kind of line delimiters between header and body

> Define Limits Of Round Tripping In Mime4J
> -----------------------------------------
>                 Key: MIME4J-112
>                 URL: https://issues.apache.org/jira/browse/MIME4J-112
>             Project: JAMES Mime4j
>          Issue Type: Task
>    Affects Versions: 0.6
>            Reporter: Robert Burrell Donkin
>             Fix For: 0.7
> By round tripping, I mean parsing some MIME document into a fully decomposed form and
then recreating a new version of the document from this form. 
> In theory, Mime4J decomposition and recomposition could be made perfect with no loss
of information. In other words, given a MIME document, the parser could completely decompose
the document and a bitwise identical copy could be recomposed.
> In practice, the limits of support are questionable. Some limitations may be expedient.
For example, perhaps comments and encoding of ASCII characters are not sufficiently important
to be worth preserving. Other limitations may arise from MIME documents which are not strictly
compliant with the specification - for example, the use of unescaped non-ASCII characters
in MIME headers may mean that the output would need to be escaped to ensure compliance.
> It is important to define and describe the limits of round tripping so that users and
developers are clear about the level of support Mime4J claims. In addition, sufficient unit
tests should be created to ensure with confidence that documents within these limits are correctly

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
