james-mime4j-dev mailing list archives

From Norman Maurer <nor...@apache.org>
Subject Re: parsing mbox files with mime4j
Date Thu, 03 Jun 2010 16:05:01 GMT
Mime4j needs to be used with one InputStream per message, so you would
need to split the mbox file into individual messages first.
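
A minimal sketch of such a split, using only the JDK. The class name
MboxSplitter and its split method are hypothetical names chosen for this
illustration and are not part of Mime4j. It assumes the common mbox
convention that every message starts with a "From " line located at the
start of the file or right after a blank line, and it loads the whole file
into memory, which may not suit very large archives.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not a Mime4j class.
final class MboxSplitter {

    /** Splits raw mbox bytes into one byte[] per message, dropping the "From " separator lines. */
    static List<byte[]> split(byte[] mbox) {
        List<byte[]> messages = new ArrayList<>();
        ByteArrayOutputStream current = null;  // stays null until the first separator is seen
        boolean previousLineBlank = true;      // the start of the file counts as a boundary
        int lineStart = 0;
        while (lineStart < mbox.length) {
            // Find the end of the current line, including its trailing '\n' if present.
            int lineEnd = lineStart;
            while (lineEnd < mbox.length && mbox[lineEnd] != '\n') {
                lineEnd++;
            }
            if (lineEnd < mbox.length) {
                lineEnd++;
            }
            int length = lineEnd - lineStart;

            if (previousLineBlank && startsWithFrom(mbox, lineStart, length)) {
                // Assumed separator: close the previous message and start a new one.
                if (current != null) {
                    messages.add(current.toByteArray());
                }
                current = new ByteArrayOutputStream();
            } else if (current != null) {
                current.write(mbox, lineStart, length);
            }
            previousLineBlank = isBlank(mbox, lineStart, length);
            lineStart = lineEnd;
        }
        if (current != null) {
            messages.add(current.toByteArray());
        }
        return messages;
    }

    private static boolean startsWithFrom(byte[] data, int offset, int length) {
        return length >= 5
                && data[offset] == 'F' && data[offset + 1] == 'r'
                && data[offset + 2] == 'o' && data[offset + 3] == 'm'
                && data[offset + 4] == ' ';
    }

    private static boolean isBlank(byte[] data, int offset, int length) {
        return (length == 1 && data[offset] == '\n')
                || (length == 2 && data[offset] == '\r' && data[offset + 1] == '\n');
    }
}
```

Each returned byte[] can then be wrapped in its own ByteArrayInputStream
and handed to stream.parse(...) the way the quoted code below does for the
whole file. The blank line that precedes a separator stays at the end of
the previous message, and body lines that begin with "From " directly
after a blank line would be mistaken for separators unless the archive
quotes them (for example as ">From ").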

Bye,
Norman


2010/6/3 Johannes Zillmann <jzillmann@googlemail.com>:
> Hi,
>
> I'm trying to parse this mbox file http://mail-archives.apache.org/mod_mbox/hadoop-core-user/200602
> with mime4j version 0.6.
> The parsing code is like this:
> --------------------------
> org.apache.james.mime4j.parser.MimeTokenStream stream = new MimeTokenStream();
> BufferedInputStream bufferedInputStream = new BufferedInputStream(new FileInputStream("/Users/jz/Documents/workspace/ms/dap/modules/dap-conductor/src/data/mbox/200602"));
> while (bufferedInputStream.available() > 0) {
>     stream.parse(bufferedInputStream);
>     handleParse(stream);
>     System.out.println("---------------------------------------------");
> }
> --------------------------
>
> Some messages seem to be parsed correctly, but sometimes the parser ends a message in
> the middle of a body and starts the next one.
>
> The middle of a body:
> --------------------------
> Context.java:266)
>        at
> org.mortbay.jetty.servlet.WebApplicationContext.doStart(WebApplicationContex
> t.java:449)
>        at org.mortbay.util.Container.start(Container.java:72)
>        at org.mortbay.http.HttpServer.doStart(HttpServer.java:753)
>        at org.mortbay.util.Container.start(Container.java:72)
>        at
> org.apache.hadoop.mapred.JobTrackerInfoServer$HTTPStarter.run(JobTrackerInfo
> Server.java:101)
> --------------------------
>
> The next field:
> --------------------------
> FIELD: ainer.start(Container.java:      72)
>        at org.mortbay.http.HttpServer.doStart(HttpServer.java:753)
>        at org.mortbay.util.Container.start(Container.java:72)
>        at
> --------------------------
>
> Is mime4j appropriate for parsing the mbox format? Is there any configuration or trick that
> can help me here?
>
> best regards
> Johannes
>
>
