james-mime4j-dev mailing list archives

From Lukáš Vlček <lukas.vl...@gmail.com>
Subject Re: Switching from 0.6 to 0.7.1 - address parsing, CRLF issues
Date Sat, 17 Dec 2011 13:45:17 GMT
Yeah, that is what I have to do now: copy and paste the implementation from
trunk into a custom DateFieldParser.
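
For reference, the expected UTC value for the header discussed further down in this thread can be checked with plain JDK classes. The sketch below is not the mime4j DateFieldParser from trunk. It swaps in java.time (Java 8 or later), and the class and method names are hypothetical, chosen only for illustration.

```java
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of mime4j: converts a Date header value to UTC.
class DateHeaderSketch {
    // Same "yyyy/MM/dd HH:mm:ss" layout as the strings quoted in the thread.
    private static final DateTimeFormatter UTC_OUT =
            DateTimeFormatter.ofPattern("yyyy/MM/dd HH:mm:ss");

    // Parses an RFC 1123 style date (numeric offset allowed) and renders it in UTC.
    static String toUtc(String dateValue) {
        OffsetDateTime parsed =
                OffsetDateTime.parse(dateValue, DateTimeFormatter.RFC_1123_DATE_TIME);
        return parsed.withOffsetSameInstant(ZoneOffset.UTC).format(UTC_OUT);
    }

    public static void main(String[] args) {
        // 12:11:10 at +0800 is 04:11:10 UTC on the same day.
        System.out.println(toUtc("Mon, 26 Mar 2007 12:11:10 +0800"));
    }
}
```

This agrees with the 0.6 result ("2007/03/26 04:11:10"). The 0.7.1 result quoted below is exactly 12 hours earlier.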

On Sat, Dec 17, 2011 at 1:55 PM, Oleg Kalnichevski <olegk@apache.org> wrote:

> On Fri, 2011-12-16 at 19:52 +0100, Stefano Bagnara wrote:
> > 2011/12/16 Lukáš Vlček <lukas.vlcek@gmail.com>:
> > > Hi Stefano,
> > >
> > > is there any workaround?
> >
> > It is already fixed in trunk. So the workaround is using a recent
> > snapshot!
> >
>
> Lukáš
>
> Alternatively you can always use a custom field parser as well
>
> Oleg
>
> > Stefano
> >
> > > Regards,
> > > Lukas
> > >
> > > On Fri, Dec 16, 2011 at 5:44 PM, Stefano Bagnara <apache@bago.org> wrote:
> > >
> > >> I guess this is a bug in 0.7.1:
> > >> https://issues.apache.org/jira/browse/MIME4J-208
> > >>
> > >> Stefano
> > >>
> > >> 2011/12/16 Lukáš Vlček <lukas.vlcek@gmail.com>:
> > >> > Hey again,
> > >> >
> > >> > I think I found another difference between 0.6 and 0.7.1
> > >> >
> > > >> > It is about parsing the "Date: Mon, 26 Mar 2007 12:11:10 +0800" header
> > > >> > field
> > > >> >
> > > >> > Given the following mbox file source:
> > > >> >
> > > >> > https://github.com/lukas-vlcek/mime4j-test/blob/workaround/src/test/resources/mbox/hibernate-announce-01.mbox
> > > >> >
> > > >> > I am getting different results.
> > > >> >
> > > >> > 0.7.1
> > > >> > it is translated into "2007/03/25 16:11:10" (UTC based)
> > > >> >
> > > >> > https://github.com/lukas-vlcek/mime4j-test/blob/workaround/src/test/java/org/mime4j/test/BasicTest.java#L132
> > > >> >
> > > >> > 0.6
> > > >> > it is translated into "2007/03/26 04:11:10" (UTC based)
> > > >> >
> > > >> > https://github.com/lukas-vlcek/mime4j-test/blob/backto06/src/test/java/org/mime4j/test/BasicTest.java#L129
> > > >> >
> > > >> > Why am I getting this difference?
> > >> >
> > >> > Regards,
> > >> > Lukas
> > >> >
> > > >> > On Fri, Dec 16, 2011 at 1:21 PM, Lukáš Vlček <lukas.vlcek@gmail.com> wrote:
> > >> >
> > >> >> Hi Stefano,
> > >> >>
> > > >> >> I was not aware of the RFC line breaks specification. Thanks!
> > > >> >>
> > > >> >> I will let you know if I encounter any other issues. Thanks a lot
> > > >> >> guys.
> > >> >>
> > >> >> Regards,
> > >> >> Lukas
> > >> >>
> > >> >>
> > > >> >> On Thu, Dec 15, 2011 at 3:32 PM, Stefano Bagnara <apache@bago.org> wrote:
> > >> >>
> > > >> >>> I guess mime4j 0.6 output was not MIME compliant.
> > > >> >>> MIME requires newlines in text parts to use CRLF (\r\n) as line
> > > >> >>> separators and also says that CR and LF are not allowed in a text
> > > >> >>> part other than in the line separator sequence.
> > >> >>>
> > >> >>> From RFC2046:
> > >> >>> ---
> > >> >>> 4.1.1.  Representation of Line Breaks
> > >> >>>
> > > >> >>>   The canonical form of any MIME "text" subtype MUST always represent a
> > > >> >>>   line break as a CRLF sequence.  Similarly, any occurrence of CRLF in
> > > >> >>>   MIME "text" MUST represent a line break.  Use of CR and LF outside of
> > > >> >>>   line break sequences is also forbidden.
> > >> >>> ---
> > >> >>>
> > > >> >>> Most email clients accept LF (\n) line separators, but CRLF is the
> > > >> >>> right one.
> > > >> >>>
> > > >> >>> So in 0.7 we fixed this bug.
> > > >> >>>
> > > >> >>> 0.6 vs 0.7 differences aside, are you experiencing issues with the
> > > >> >>> CRLF used by mime4j 0.7?
> > >> >>>
> > >> >>> Stefano
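
When comparing TextBody content across the two versions, one option is to normalize line breaks before asserting. The sketch below uses only JDK string methods. The class and method names are hypothetical, and treating a bare CR or bare LF as a line break is an assumption made for the sketch, since RFC 2046 forbids them outside CRLF.

```java
// Hypothetical helper, not part of mime4j: line break normalization for test comparisons.
class LineBreakSketch {
    // Canonical MIME form: every line break becomes CRLF.
    // The alternation tries "\r\n" first so an existing CRLF is not doubled.
    static String toCrlf(String text) {
        return text.replaceAll("\\r\\n|\\r|\\n", "\r\n");
    }

    // Collapses CRLF to LF, e.g. to compare 0.7 output with text that 0.6 produced.
    static String toLf(String text) {
        return text.replace("\r\n", "\n");
    }

    public static void main(String[] args) {
        String fromNewVersion = "line one\r\nline two";
        String fromOldVersion = "line one\nline two";
        System.out.println(toLf(fromNewVersion).equals(fromOldVersion));
    }
}
```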
> > >> >>>
> > >> >>> 2011/12/15 Lukáš Vlček <lukas.vlcek@gmail.com>:
> > >> >>> > Hey again,
> > >> >>> >
> > > >> >>> > I downgraded my code to version 0.6 to see what differences I
> > > >> >>> > will get.
> > > >> >>> >
> > > >> >>> > Unfortunately, I was not able to prove that the below-mentioned
> > > >> >>> > message.getHeader().getField("from").getBody() did decoding;
> > > >> >>> > however, I was able to show that I am getting different content
> > > >> >>> > from TextBody.
> > >> >>> >
> > >> >>> > There are two branches in my github repo now:
> > > >> >>> > - workaround (using mime4j 0.7.2)
> > >> >>> > - backto06 (using mime4j 0.6)
> > >> >>> >
> > >> >>> > I would like to point out the following parts of my test:
> > >> >>> >
> > >> >>>
> > >>
> https://github.com/lukas-vlcek/mime4j-test/blob/workaround/src/test/java/org/mime4j/test/BasicTest.java#L112
> > >> >>> > vs.
> > >> >>> >
> > >> >>>
> > >>
> https://github.com/lukas-vlcek/mime4j-test/blob/backto06/src/test/java/org/mime4j/test/BasicTest.java#L110
> > >> >>> >
> > > >> >>> > The first is using 0.7.2 and I am getting the "\r\n" sequence, whereas
> > > >> >>> > using 0.6 I am getting only "\n".
> > > >> >>> >
> > > >> >>> > I am not saying there is a bug in Mime4J but I would like to
> > > >> >>> > understand what has changed and why I am getting different results
> > > >> >>> > using different mime4j versions. As you can see I did not change
> > > >> >>> > anything important in any of the util classes between the "workaround"
> > > >> >>> > and "backto06" branches:
> > >> >>> >
> > >> >>>
> > >>
> https://github.com/lukas-vlcek/mime4j-test/compare/workaround...backto06
> > >> >>> >
> > >> >>> > The only important change (except different version of
mime4j)
> is in
> > >> >>> > ParseUtil class where I had to drop MessageBuilder logic:
> > >> >>> >
> > >> >>>
> > >>
> https://github.com/lukas-vlcek/mime4j-test/compare/workaround...backto06#diff-2
> > >> >>> >
> > >> >>> > Any idea why I am getting "\r\n" chars instead of "\n"?
> > >> >>> >
> > >> >>> > Regards,
> > >> >>> > Lukas
> > >> >>> >
> > > >> >>> > On Thu, Dec 15, 2011 at 1:19 PM, Lukáš Vlček <lukas.vlcek@gmail.com> wrote:
> > >> >>> >
> > >> >>> >> OK, got it.
> > >> >>> >> (Although in 0.6 it was returning decoded content)
> > >> >>> >>
> > >> >>> >> Regards,
> > >> >>> >> Lukas
> > >> >>> >>
> > >> >>> >>
> > > >> >>> >> On Thu, Dec 15, 2011 at 1:16 PM, Oleg Kalnichevski <olegk@apache.org> wrote:
> > > >> >>> >>
> > > >> >>> >>> On Thu, 2011-12-15 at 11:41 +0100, Lukáš Vlček wrote:
> > >> >>> >>> > Hi,
> > >> >>> >>> >
> > >> >>> >>> > I tried it and it works. Thanks!
> > >> >>> >>> >
> > > >> >>> >>> > However, I am still not sure if this fixed everything.
> > > >> >>> >>> > See the following commit in my test repo on github (I added a new
> > > >> >>> >>> > branch called "workaround")
> > > >> >>> >>> >
> > > >> >>> >>> > https://github.com/lukas-vlcek/mime4j-test/commit/385c66847bec4393ad67069fb367c174f87c5656
> > > >> >>> >>> >
> > > >> >>> >>> > As you can see the call to message.getFrom().get(0).getName() returns
> > > >> >>> >>> > the expected data, but message.getHeader().getField("from").getBody()
> > > >> >>> >>> > does not.
> > > >> >>> >>> > At least that is how I understand its JavaDoc:
> > > >> >>> >>> >
> > > >> >>> >>> > http://james.apache.org/mime4j/apidocs/org/apache/james/mime4j/stream/Field.html#getBody()
> > > >> >>> >>> >
> > > >> >>> >>> > "Gets the unparsed and possibly encoded (see RFC 2047) field body
> > > >> >>> >>> > string."
> > > >> >>> >>> >
> > > >> >>> >>> > How should I understand the "encoded" in this context?
> > >> >>> >>> >
> > >> >>> >>>
> > > >> >>> >>> Encoded actually means, well, encoded, as specified in RFC 2047.
> > >> >>> >>>
> > >> >>> >>> Oleg
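
To illustrate what "encoded" means here, the sketch below builds and decodes an RFC 2047 encoded-word of the "B" (Base64) kind using only JDK classes. It is not how mime4j decodes fields. The class and method names are hypothetical, and the "Q" encoding, the 75-character limit and whitespace folding between adjacent encoded-words are deliberately left out.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of mime4j: minimal RFC 2047 "B" encoded-word handling.
class EncodedWordSketch {
    // Matches =?charset?B?base64-text?= (group 1: charset, group 2: payload).
    private static final Pattern B_WORD =
            Pattern.compile("=\\?([^?]+)\\?[bB]\\?([^?]*)\\?=");

    // Wraps text as a single UTF-8 "B" encoded-word.
    static String encodeB(String text) {
        String b64 = Base64.getEncoder()
                .encodeToString(text.getBytes(StandardCharsets.UTF_8));
        return "=?UTF-8?B?" + b64 + "?=";
    }

    // Replaces every "B" encoded-word in a raw field body with its decoded text.
    static String decodeB(String fieldBody) {
        Matcher m = B_WORD.matcher(fieldBody);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            byte[] raw = Base64.getDecoder().decode(m.group(2));
            String decoded = new String(raw, Charset.forName(m.group(1)));
            m.appendReplacement(out, Matcher.quoteReplacement(decoded));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Raw body as a getBody()-style call might return it, then the decoded form.
        String raw = encodeB("Luk\u00e1\u0161 Vl\u010dek") + " <someone@example.org>";
        System.out.println(raw);
        System.out.println(decodeB(raw));
    }
}
```

The raw field body keeps the "=?UTF-8?B?...?=" form, while a parsed accessor such as getName() returns the decoded text.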
> > >> >>> >>>
> > >> >>> >>>
> > >> >>> >>>
> > >> >>> >>
> > >> >>>
> > >> >>
> > >> >>
> > >>
>
>
>
