james-mime4j-dev mailing list archives

From Stefano Bagnara <apa...@bago.org>
Subject Re: Switching from 0.6 to 0.7.1 - address parsing, CRLF issues
Date Fri, 16 Dec 2011 18:52:26 GMT
2011/12/16 Lukáš Vlček <lukas.vlcek@gmail.com>:
> Hi Stefano,
>
> is there any workaround?

It is already fixed in trunk. So the workaround is using a recent snapshot!

Stefano
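
For reference, the two results quoted below for "Mon, 26 Mar 2007 12:11:10 +0800" differ by exactly 12 hours, and the value reported for 0.6 is the one that matches a plain offset conversion to UTC. The sketch below checks that conversion with java.time only. It does not use mime4j at all, and the class and method names are made up for illustration.

```java
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical helper, not part of mime4j: converts an RFC 822/1123 style
// date string to a UTC timestamp in the "yyyy/MM/dd HH:mm:ss" layout used
// in the thread.
final class DateHeaderUtcDemo {
    private static final DateTimeFormatter OUT =
            DateTimeFormatter.ofPattern("yyyy/MM/dd HH:mm:ss", Locale.ROOT);

    static String toUtc(String headerDate) {
        // RFC_1123_DATE_TIME accepts numeric offsets such as "+0800".
        OffsetDateTime parsed =
                OffsetDateTime.parse(headerDate, DateTimeFormatter.RFC_1123_DATE_TIME);
        // Same instant, re-expressed at offset zero.
        return parsed.withOffsetSameInstant(ZoneOffset.UTC).format(OUT);
    }

    public static void main(String[] args) {
        // 12:11:10 at +0800 is 04:11:10 UTC on the same day.
        System.out.println(toUtc("Mon, 26 Mar 2007 12:11:10 +0800"));
    }
}
```

A check like this in a test makes it easy to see which mime4j version returns the expected instant.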

> Regards,
> Lukas
>
> On Fri, Dec 16, 2011 at 5:44 PM, Stefano Bagnara <apache@bago.org> wrote:
>
>> I guess this is a bug in 0.7.1:
>> https://issues.apache.org/jira/browse/MIME4J-208
>>
>> Stefano
>>
>> 2011/12/16 Lukáš Vlček <lukas.vlcek@gmail.com>:
>> > Hey again,
>> >
>> > I think I found another difference between 0.6 and 0.7.1
>> >
>> > It is about parsing the "Date: Mon, 26 Mar 2007 12:11:10 +0800" header field
>> >
>> > Given the following mbox file source:
>> >
>> https://github.com/lukas-vlcek/mime4j-test/blob/workaround/src/test/resources/mbox/hibernate-announce-01.mbox
>> >
>> > I am getting different results.
>> >
>> > 0.7.1
>> > it is translated into "2007/03/25 16:11:10" (UTC based)
>> >
>> https://github.com/lukas-vlcek/mime4j-test/blob/workaround/src/test/java/org/mime4j/test/BasicTest.java#L132
>> >
>> > 0.6
>> > it is translated into "2007/03/26 04:11:10" (UTC based)
>> >
>> https://github.com/lukas-vlcek/mime4j-test/blob/backto06/src/test/java/org/mime4j/test/BasicTest.java#L129
>> >
>> > Why am I getting this difference?
>> >
>> > Regards,
>> > Lukas
>> >
>> > On Fri, Dec 16, 2011 at 1:21 PM, Lukáš Vlček <lukas.vlcek@gmail.com> wrote:
>> >
>> >> Hi Stefano,
>> >>
>> >> I was not aware of the RFC line breaks specification. Thanks!
>> >>
>> >> I will let you know if I encounter any other issues. Thanks a lot guys.
>> >>
>> >> Regards,
>> >> Lukas
>> >>
>> >>
>> >> On Thu, Dec 15, 2011 at 3:32 PM, Stefano Bagnara <apache@bago.org> wrote:
>> >>
>> >>> I guess mime4j 0.6 output was not mime compliant.
>> >>> MIME requires newlines in text parts to use CRLF (\r\n) as line
>> >>> separators and also says that CR and LF are not allowed in a text part
>> >>> other than in the line separator sequence.
>> >>>
>> >>> From RFC2046:
>> >>> ---
>> >>> 4.1.1.  Representation of Line Breaks
>> >>>
>> >>>   The canonical form of any MIME "text" subtype MUST always represent a
>> >>>   line break as a CRLF sequence.  Similarly, any occurrence of CRLF in
>> >>>   MIME "text" MUST represent a line break.  Use of CR and LF outside of
>> >>>   line break sequences is also forbidden.
>> >>> ---
>> >>>
>> >>> Most email clients accept LF (\n) line separators, but CRLF is the
>> >>> right one.
>> >>>
>> >>> So in 0.7 we fixed this bug.
>> >>>
>> >>> 0.6 vs 0.7 differences aside, are you experiencing issues with the
>> >>> CRLF used by mime4j 0.7 ?
>> >>>
>> >>> Stefano
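
Code or tests written against 0.6 may compare body text with "\n" literals and so fail once the parser hands back CRLF. One hedged way to cope is to normalize line breaks explicitly before comparing. The sketch below is plain string handling with hypothetical names and no mime4j calls, and it assumes bare CR characters do not occur in the input.

```java
// Hypothetical helper, not part of mime4j: explicit line-break normalization
// so that comparisons do not depend on which separator the parser returned.
final class LineBreakDemo {
    // Collapse CRLF to LF, e.g. before comparing against "\n"-based fixtures.
    static String toLf(String text) {
        return text.replace("\r\n", "\n");
    }

    // Canonical MIME form: every line break becomes CRLF.
    // Collapsing first avoids turning an existing "\r\n" into "\r\r\n".
    static String toCrlf(String text) {
        return toLf(text).replace("\n", "\r\n");
    }

    public static void main(String[] args) {
        String fromParser = "line one\r\nline two\r\n";
        System.out.println(toLf(fromParser).equals("line one\nline two\n"));
    }
}
```

Normalizing at the comparison boundary keeps stored or transmitted text in its canonical CRLF form while older fixtures continue to work.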
>> >>>
>> >>> 2011/12/15 Lukáš Vlček <lukas.vlcek@gmail.com>:
>> >>> > Hey again,
>> >>> >
>> >>> > I downgraded my code to version 0.6 to see what differences I will
>> >>> > get.
>> >>> >
>> >>> > Unfortunately, I was not able to prove that the below-mentioned
>> >>> > message.getHeader().getField("from").getBody() did decoding;
>> >>> > however, I was able to show that I am getting different content from
>> >>> > TextBody.
>> >>> >
>> >>> > There are two branches in my github repo now:
>> >>> > - workaround (using mime4j 0.7.2)
>> >>> > - backto06 (using mime4j 0.6)
>> >>> >
>> >>> > I would like to point out the following parts of my test:
>> >>> >
>> >>>
>> https://github.com/lukas-vlcek/mime4j-test/blob/workaround/src/test/java/org/mime4j/test/BasicTest.java#L112
>> >>> > vs.
>> >>> >
>> >>>
>> https://github.com/lukas-vlcek/mime4j-test/blob/backto06/src/test/java/org/mime4j/test/BasicTest.java#L110
>> >>> >
>> >>> > The first uses 0.7.2 and I am getting the "\r\n" sequence, whereas with
>> >>> > 0.6 I am getting only "\n".
>> >>> >
>> >>> > I am not saying there is a bug in Mime4J, but I would like to understand
>> >>> > what has changed and why I am getting different results with different
>> >>> > mime4j versions. As you can see, I did not change anything important in
>> >>> > any of the util classes between the "workaround" and "backto06" branches:
>> >>> >
>> >>>
>> https://github.com/lukas-vlcek/mime4j-test/compare/workaround...backto06
>> >>> >
>> >>> > The only important change (apart from the different mime4j version) is in
>> >>> > the ParseUtil class, where I had to drop the MessageBuilder logic:
>> >>> >
>> >>>
>> https://github.com/lukas-vlcek/mime4j-test/compare/workaround...backto06#diff-2
>> >>> >
>> >>> > Any idea why I am getting "\r\n" chars instead of "\n"?
>> >>> >
>> >>> > Regards,
>> >>> > Lukas
>> >>> >
>> >>> > On Thu, Dec 15, 2011 at 1:19 PM, Lukáš Vlček <lukas.vlcek@gmail.com> wrote:
>> >>> >
>> >>> >> OK, got it.
>> >>> >> (Although in 0.6 it was returning decoded content)
>> >>> >>
>> >>> >> Regards,
>> >>> >> Lukas
>> >>> >>
>> >>> >>
>> >>> >> On Thu, Dec 15, 2011 at 1:16 PM, Oleg Kalnichevski <olegk@apache.org> wrote:
>> >>> >>
>> >>> >>> On Thu, 2011-12-15 at 11:41 +0100, Lukáš Vlček wrote:
>> >>> >>> > Hi,
>> >>> >>> >
>> >>> >>> > I tried it and it works. Thanks!
>> >>> >>> >
>> >>> >>> > However, I am still not sure if this fixed everything.
>> >>> >>> > See the following commit in my test repo on github (I added a new
>> >>> >>> > branch called "workaround")
>> >>> >>> >
>> >>> >>>
>> >>>
>> https://github.com/lukas-vlcek/mime4j-test/commit/385c66847bec4393ad67069fb367c174f87c5656
>> >>> >>> >
>> >>> >>> > As you can see, the call to message.getFrom().get(0).getName() returns
>> >>> >>> > the expected data, but message.getHeader().getField("from").getBody()
>> >>> >>> > does not.
>> >>> >>> > At least that is how I understand its JavaDoc:
>> >>> >>> >
>> >>> >>>
>> >>>
>> http://james.apache.org/mime4j/apidocs/org/apache/james/mime4j/stream/Field.html#getBody()
>> >>> >>> >
>> >>> >>> > "Gets the unparsed and possibly encoded (see RFC 2047) field body
>> >>> >>> > string."
>> >>> >>> >
>> >>> >>> > How should I understand the "encoded" in this context?
>> >>> >>> >
>> >>> >>>
>> >>> >>> Encoded actually means, well, encoded, as specified in RFC 2047.
>> >>> >>>
>> >>> >>> Oleg
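
To make "encoded" concrete: an RFC 2047 encoded word has the shape =?charset?encoding?text?=, so a raw From field body carrying a non-ASCII display name is pure ASCII on the wire. The sketch below decodes only the Base64 ("B") variant using the JDK. It is an illustration with hypothetical names, not mime4j's decoder, it ignores the "Q" encoding and the whitespace rule between adjacent encoded words, and the sample header value is invented.

```java
import java.nio.charset.Charset;
import java.util.Base64;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of mime4j: decodes "B" encoded words only.
final class EncodedWordDemo {
    // Groups: 1 = charset name, 2 = Base64 payload.
    private static final Pattern B_WORD =
            Pattern.compile("=\\?([^?]+)\\?[bB]\\?([^?]*)\\?=");

    static String decodeBWords(String raw) {
        Matcher m = B_WORD.matcher(raw);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            byte[] bytes = Base64.getDecoder().decode(m.group(2));
            String text = new String(bytes, Charset.forName(m.group(1)));
            // quoteReplacement keeps '$' and '\' in the decoded text literal.
            m.appendReplacement(out, Matcher.quoteReplacement(text));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Invented sample: what an unparsed field body might look like.
        String raw = "=?UTF-8?B?THVrw6HFoSBWbMSNZWs=?= <user@example.org>";
        System.out.println(decodeBWords(raw));
    }
}
```

This matches the behaviour described in the thread: the raw field body stays in encoded form, while the parsed address accessors return the decoded name.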
>> >>> >>>
>> >>> >>>
>> >>> >>>
>> >>> >>
>> >>>
>> >>
>> >>
>>