james-mime4j-dev mailing list archives

From Lukáš Vlček <lukas.vl...@gmail.com>
Subject Re: Switching from 0.6 to 0.7.1 - address parsing, CRLF issues
Date Mon, 19 Dec 2011 10:21:39 GMT
Thanks,

this helped me.

Regards,
Lukas

On Sun, Dec 18, 2011 at 1:22 PM, Oleg Kalnichevski <olegk@apache.org> wrote:

> On Sun, 2011-12-18 at 12:57 +0100, Stefano Bagnara wrote:
> > 2011/12/18 Lukáš Vlček <lukas.vlcek@gmail.com>:
> > > Hi again,
> > >
> > > there is another difference between 0.6 and 0.7.1
> > >
> > > It happens when I need to parse message with weird charset 'x-gbk'.
> > >
> > > I added a new test case into my GitHub repo for both 0.6 and 0.7.1 versions.
> > > Basically the problem happens when I try to get Reader from the TextBody:
> > >
> > > Message message = getMessage(....);
> > > TextBody body = (TextBody) message.getBody();
> > > body.getReader(); // here the exception is fired
> > >
> > > It worked fine in 0.6 but it throws UnsupportedEncodingException in 0.7.1,
> > > which is quite unfortunate because I do not see a way to extract the body
> > > content in this case (as I need to get the Reader first).
> > >
> > > Here are examples:
> > > 0.6
> > > https://github.com/lukas-vlcek/mime4j-test/blob/backto06/src/test/java/org/mime4j/test/BasicTest.java#L134
> > > 0.7.1
> > > https://github.com/lukas-vlcek/mime4j-test/blob/workaround/src/test/java/org/mime4j/test/BasicTest.java#L134
> > >
> > > Any idea how to deal with this situation?
> > >
> > > Note, I understand that 'x-gbk' is an unknown charset for the JVM, but even
> > > in that case I was able to extract the body content in 0.6 by forcing the
> > > 'gbk' charset in my utility classes. With 0.7.1 I am unable to do that
> > > because the very basic body.getReader call fails. Is there any workaround
> > > to get the body Reader without hitting the exception?
> >
> > Please open a JIRA issue, attach your test case and the stack trace of
> > the exception you get.
> > I guess we can classify this as a bug because we want to give users a
> > way to deal with malformed stuff, so also unrecognized charsets.
> >
> > I don't know if we already have an existing workaround but the JIRA
> > issue will anyway help keeping track of the issue for future users.
> >
> > Stefano
>
>
> There's no need to raise a JIRA, because basically there is no issue.
> #getReader is merely a convenience method. One can always use
> #getInputStream() to get raw data stream and apply whatever charset
> encoding.
>
> Lukáš, what is wrong with that?
>
> ---
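> Message message =
> ParserUtil.getMessage(getInputStream("mbox/jboss-as7-dev-01.mbox"));
>
> assertTrue(message.getBody() instanceof TextBody);
> TextBody body = (TextBody) message.getBody();
>
> // Map the label the JVM does not know to one it does.
> // The constant-first comparison stays safe if no charset was declared.
> String charset = message.getCharset();
> if ("x-gbk".equalsIgnoreCase(charset)) {
>    charset = "gbk";
> }
> // Decode the raw body bytes with the corrected charset.
> Reader reader = new InputStreamReader(body.getInputStream(), charset);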
> ---
>
>
>
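
For future readers, the workaround above can be generalised into a small helper. This is only a sketch under stated assumptions: the class CharsetFallback, its alias table, and the ISO-8859-1 fallback are hypothetical choices for illustration, not part of mime4j. It uses only JDK classes and resolves a declared charset label to one the JVM supports before wrapping the raw stream.

```java
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper (not part of mime4j): tolerant charset resolution. */
final class CharsetFallback {

    // Hand-maintained aliases for labels the JVM may not recognise.
    // Assumption: "x-gbk" should be read as GBK, as in the thread above.
    private static final Map<String, String> ALIASES = new HashMap<>();
    static {
        ALIASES.put("x-gbk", "GBK");
    }

    private CharsetFallback() {}

    /** Returns the charset for the declared label, or the fallback if it cannot be resolved. */
    static Charset resolve(String declared, Charset fallback) {
        if (declared == null || declared.trim().isEmpty()) {
            return fallback; // no charset declared at all
        }
        String name = declared.trim();
        String alias = ALIASES.get(name.toLowerCase(Locale.ROOT));
        if (alias != null) {
            name = alias;
        }
        try {
            return Charset.isSupported(name) ? Charset.forName(name) : fallback;
        } catch (IllegalCharsetNameException e) {
            return fallback; // label contains characters not legal in a charset name
        }
    }

    /** Wraps a raw byte stream in a Reader using the resolved charset. */
    static Reader openReader(InputStream raw, String declared) {
        // Assumption: ISO-8859-1 as the last resort, since it maps every byte to a char.
        return new InputStreamReader(raw, resolve(declared, StandardCharsets.ISO_8859_1));
    }
}
```

With the body and message from the quoted snippet, the call would presumably be CharsetFallback.openReader(body.getInputStream(), message.getCharset()). Whether a silent fallback is acceptable depends on the application; logging the unresolved label may be preferable.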
