james-mime4j-dev mailing list archives

From Stefano Bagnara <apa...@bago.org>
Subject Re: Switching from 0.6 to 0.7.1 - address parsing, CRLF issues
Date Fri, 16 Dec 2011 16:44:24 GMT
I guess this is a bug in 0.7.1:
https://issues.apache.org/jira/browse/MIME4J-208

Stefano

2011/12/16 Lukáš Vlček <lukas.vlcek@gmail.com>:
> Hey again,
>
> I think I found another difference between 0.6 and 0.7.1
>
> It is about parsing the "Date: Mon, 26 Mar 2007 12:11:10 +0800" header field
>
> Given the following mbox file source:
> https://github.com/lukas-vlcek/mime4j-test/blob/workaround/src/test/resources/mbox/hibernate-announce-01.mbox
>
> I am getting different results.
>
> 0.7.1
> it is translated into "2007/03/25 16:11:10" (UTC based)
> https://github.com/lukas-vlcek/mime4j-test/blob/workaround/src/test/java/org/mime4j/test/BasicTest.java#L132
>
> 0.6
> it is translated into "2007/03/26 04:11:10" (UTC based)
> https://github.com/lukas-vlcek/mime4j-test/blob/backto06/src/test/java/org/mime4j/test/BasicTest.java#L129
>
> Why am I getting this difference?
>
> Regards,
> Lukas
>
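As a sanity check on the two values reported above, here is a minimal sketch that converts the quoted Date header to UTC with the JDK's java.time API only. Nothing in it is mime4j code, and the class name DateHeaderCheck and its helpers are made up for illustration. Under those assumptions, the 0.6 value is the expected one and the 0.7.1 value is exactly 12 hours earlier.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical helper, not part of mime4j: it relies only on java.time.
class DateHeaderCheck {
    // Matches the "yyyy/MM/dd HH:mm:ss" strings quoted in the thread.
    static final DateTimeFormatter OUT =
            DateTimeFormatter.ofPattern("uuuu/MM/dd HH:mm:ss", Locale.ROOT);

    // Parse an RFC 1123 style date, as used in mail headers, and shift it to UTC.
    static OffsetDateTime toUtc(String headerValue) {
        return OffsetDateTime
                .parse(headerValue, DateTimeFormatter.RFC_1123_DATE_TIME)
                .withOffsetSameInstant(ZoneOffset.UTC);
    }

    // Whole hours by which a reported UTC string lags the expected instant.
    static long hoursBehind(String reportedUtc, OffsetDateTime expectedUtc) {
        OffsetDateTime reported =
                LocalDateTime.parse(reportedUtc, OUT).atOffset(ZoneOffset.UTC);
        return Duration.between(reported, expectedUtc).toHours();
    }

    public static void main(String[] args) {
        OffsetDateTime utc = toUtc("Mon, 26 Mar 2007 12:11:10 +0800");
        System.out.println("expected UTC: " + utc.format(OUT));
        System.out.println("0.6 is behind by (h):   "
                + hoursBehind("2007/03/26 04:11:10", utc));
        System.out.println("0.7.1 is behind by (h): "
                + hoursBehind("2007/03/25 16:11:10", utc));
    }
}
```

The sketch only compares the reported strings against the JDK's result. It does not show where the 12-hour shift comes from inside mime4j.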
> On Fri, Dec 16, 2011 at 1:21 PM, Lukáš Vlček <lukas.vlcek@gmail.com> wrote:
>
>> Hi Stefano,
>>
>> I was not aware of the RFC line break specification. Thanks!
>>
>> I will let you know if I encounter any other issues. Thanks a lot guys.
>>
>> Regards,
>> Lukas
>>
>>
>> On Thu, Dec 15, 2011 at 3:32 PM, Stefano Bagnara <apache@bago.org> wrote:
>>
>>> I guess mime4j 0.6 output was not MIME compliant.
>>> MIME requires newlines in text parts to use CRLF (\r\n) as line
>>> separators and also says that CR and LF are not allowed in a text part
>>> other than in the line separator sequence.
>>>
>>> From RFC2046:
>>> ---
>>> 4.1.1.  Representation of Line Breaks
>>>
>>>   The canonical form of any MIME "text" subtype MUST always represent a
>>>   line break as a CRLF sequence.  Similarly, any occurrence of CRLF in
>>>   MIME "text" MUST represent a line break.  Use of CR and LF outside of
>>>   line break sequences is also forbidden.
>>> ---
>>>
>>> Most email clients accept LF (\n) line separators, but CRLF is the right
>>> one.
>>>
>>> So in 0.7 we fixed this bug.
>>>
>>> 0.6 vs 0.7 differences aside, are you experiencing issues with the
>>> CRLF used by mime4j 0.7 ?
>>>
>>> Stefano
>>>
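For tests that compare body text across the two versions, one option is to normalize line endings before asserting. The sketch below uses plain JDK string handling. The class and method names are hypothetical and none of this is mime4j API. Treating a lone CR as a line break is a choice made for this example, since RFC 2046 forbids bare CR and LF in canonical text anyway.

```java
// Hypothetical helper for tests, not part of mime4j.
class LineEndings {
    // Rewrite CRLF, lone CR, and lone LF as the canonical CRLF sequence.
    // CRLF is listed first in the alternation so it is matched as one unit.
    static String toCrlf(String text) {
        return text.replaceAll("\r\n|\r|\n", "\r\n");
    }

    // Rewrite CRLF and lone CR as LF, e.g. to compare against "\n"-based fixtures.
    static String toLf(String text) {
        return text.replaceAll("\r\n|\r", "\n");
    }

    public static void main(String[] args) {
        String fromOldVersion = "line one\nline two\n";      // LF only, as seen with 0.6
        String fromNewVersion = "line one\r\nline two\r\n";  // CRLF, as seen with 0.7

        // Both comparisons hold once the line endings are normalized.
        System.out.println(toCrlf(fromOldVersion).equals(fromNewVersion));
        System.out.println(toLf(fromNewVersion).equals(fromOldVersion));
    }
}
```

Normalizing in the test, rather than in the parsed output, keeps the CRLF form that 0.7 produces intact for anything that writes the message back out.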
>>> 2011/12/15 Lukáš Vlček <lukas.vlcek@gmail.com>:
>>> > Hey again,
>>> >
>>> > I downgraded my code to version 0.6 to see what differences I would get.
>>> >
>>> > Unfortunately, I was not able to prove that the below-mentioned
>>> > message.getHeader().getField("from").getBody() did the decoding.
>>> > However, I was able to show that I am getting different content from
>>> > TextBody.
>>> >
>>> > There are two branches in my github repo now:
>>> > - workaround (using mime4j 0.7.2)
>>> > - backto06 (using mime4j 0.6)
>>> >
>>> > I would like to point out the following parts of my test:
>>> >
>>> > https://github.com/lukas-vlcek/mime4j-test/blob/workaround/src/test/java/org/mime4j/test/BasicTest.java#L112
>>> > vs.
>>> > https://github.com/lukas-vlcek/mime4j-test/blob/backto06/src/test/java/org/mime4j/test/BasicTest.java#L110
>>> >
>>> > The first is using 0.7.2 and I am getting the "\r\n" sequence, whereas
>>> > using 0.6 I am getting only "\n".
>>> >
>>> > I am not saying there is a bug in Mime4J, but I would like to understand
>>> > what has changed and why I am getting different results using different
>>> > mime4j versions. As you can see, I did not change anything important in
>>> > any of the util classes between the "workaround" and "backto06" branches:
>>> >
>>> > https://github.com/lukas-vlcek/mime4j-test/compare/workaround...backto06
>>> >
>>> > The only important change (except the different version of mime4j) is in
>>> > the ParseUtil class, where I had to drop the MessageBuilder logic:
>>> >
>>> > https://github.com/lukas-vlcek/mime4j-test/compare/workaround...backto06#diff-2
>>> >
>>> > Any idea why I am getting "\r\n" chars instead of "\n"?
>>> >
>>> > Regards,
>>> > Lukas
>>> >
>>> > On Thu, Dec 15, 2011 at 1:19 PM, Lukáš Vlček <lukas.vlcek@gmail.com> wrote:
>>> >
>>> >> OK, got it.
>>> >> (Although in 0.6 it was returning decoded content)
>>> >>
>>> >> Regards,
>>> >> Lukas
>>> >>
>>> >>
>>> >> On Thu, Dec 15, 2011 at 1:16 PM, Oleg Kalnichevski <olegk@apache.org> wrote:
>>> >>
>>> >>> On Thu, 2011-12-15 at 11:41 +0100, Lukáš Vlček wrote:
>>> >>> > Hi,
>>> >>> >
>>> >>> > I tried it and it works. Thanks!
>>> >>> >
>>> >>> > However, I am still not sure if this fixed everything.
>>> >>> > See the following commit in my test repo on github (I added a new
>>> >>> > branch called "workaround"):
>>> >>> >
>>> >>> > https://github.com/lukas-vlcek/mime4j-test/commit/385c66847bec4393ad67069fb367c174f87c5656
>>> >>> >
>>> >>> > As you can see, the call to message.getFrom().get(0).getName() returns
>>> >>> > the expected data, but message.getHeader().getField("from").getBody()
>>> >>> > does not. At least that is how I understand its JavaDoc:
>>> >>> >
>>> >>> > http://james.apache.org/mime4j/apidocs/org/apache/james/mime4j/stream/Field.html#getBody()
>>> >>> >
>>> >>> > "Gets the unparsed and possibly encoded (see RFC 2047) field body
>>> >>> > string."
>>> >>> >
>>> >>> > How should I understand the "encoded" in this context?
>>> >>> >
>>> >>>
>>> >>> Encoded actually means, well, encoded, as specified in RFC 2047.
>>> >>>
>>> >>> Oleg
>>> >>>
>>> >>>
>>> >>>
>>> >>
>>>
>>
>>
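To illustrate what "encoded, as specified in RFC 2047" looks like in a raw field body, here is a minimal sketch that builds and decodes a Base64 ("B") encoded-word with the JDK only. The class EncodedWordSketch, its methods, and the sample address are hypothetical, and this is not how mime4j implements decoding. It skips the "Q" encoding and the RFC 2047 rules for whitespace between adjacent encoded-words.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of RFC 2047 "B" encoded-words, not mime4j code.
class EncodedWordSketch {
    // Shape of an encoded-word: =?charset?B?base64-text?=
    private static final Pattern B_WORD =
            Pattern.compile("=\\?([^?]+)\\?[bB]\\?([^?]*)\\?=");

    // Wrap text as a single UTF-8 "B" encoded-word (length limits are ignored).
    static String encodeB(String text) {
        String b64 = Base64.getEncoder()
                .encodeToString(text.getBytes(StandardCharsets.UTF_8));
        return "=?UTF-8?B?" + b64 + "?=";
    }

    // Replace every "B" encoded-word in a raw field body with its decoded text.
    static String decodeB(String rawBody) {
        Matcher m = B_WORD.matcher(rawBody);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            byte[] bytes = Base64.getDecoder().decode(m.group(2));
            String decoded = new String(bytes, Charset.forName(m.group(1)));
            m.appendReplacement(out, Matcher.quoteReplacement(decoded));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Unicode escapes keep the source file ASCII-only.
        String name = "Luk\u00e1\u0161 Vl\u010dek";
        String rawBody = encodeB(name) + " <user@example.org>";

        System.out.println("raw (what an unparsed field body can look like): " + rawBody);
        System.out.println("decoded: " + decodeB(rawBody));
    }
}
```

In terms of the thread, the raw line corresponds to what getBody() is documented to return, while the decoded line corresponds to the kind of value getName() returned from the parsed address.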
