james-mime4j-dev mailing list archives

From Stefano Bagnara <apa...@bago.org>
Subject Re: Headless mail renderer
Date Tue, 25 Jan 2011 09:30:58 GMT
2011/1/25 Noss Benoit <benoit.noss@secu.lu>:
> Hi, after your comments, I now think I have to split my project into two
> parts:
>
> 1/ The first part has to parse the message and write an HTML or XHTML page
> representing the output I want for the message.
> 2/ The second part has to render the previously generated HTML to PDF.

I do that in a single step because of the Content-ID ("cid:") image references.
Logically, though, you still need two separate components: a parser and a renderer.
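
To illustrate the "cid:" problem, here is a minimal sketch (not the code I actually use) that rewrites cid: image references in the generated HTML into data: URIs, so the HTML can be handed to a separate renderer without the original MIME parts. The names CidInliner, InlinePart and inlineCidImages are hypothetical. It assumes you have already collected the inline parts into a map keyed by Content-ID without the surrounding angle brackets, and that your renderer accepts data: URIs; if it does not, write the parts to temporary files and substitute file URLs instead.

```java
import java.util.Base64;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: replaces src="cid:..." attributes with data: URIs.
final class CidInliner {

    // Hypothetical holder for one inline MIME part (e.g. an embedded image).
    static final class InlinePart {
        final String mimeType;
        final byte[] data;

        InlinePart(String mimeType, byte[] data) {
            this.mimeType = mimeType;
            this.data = data;
        }
    }

    // Matches src="cid:ID" or src='cid:ID'; group 3 is the Content-ID.
    private static final Pattern CID_SRC = Pattern.compile(
            "(src\\s*=\\s*)([\"'])cid:([^\"']+)\\2", Pattern.CASE_INSENSITIVE);

    // Assumption: map keys are Content-IDs without the surrounding < >.
    static String inlineCidImages(String html, Map<String, InlinePart> partsByContentId) {
        Matcher m = CID_SRC.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            InlinePart part = partsByContentId.get(m.group(3));
            String replacement;
            if (part == null) {
                // Unknown Content-ID: leave the reference untouched.
                replacement = m.group();
            } else {
                String dataUri = "data:" + part.mimeType + ";base64,"
                        + Base64.getEncoder().encodeToString(part.data);
                replacement = m.group(1) + m.group(2) + dataUri + m.group(2);
            }
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

A regex is enough for a sketch; on real mail it is safer to do the same substitution on the parsed DOM, since attributes in HTML mail are not always quoted.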

> I tried Flying Saucer in the past; it can generate PDF, but it needs strict
> XHTML as input, and lots of mails are not strict XHTML

I've had very good results parsing the HTML with the validator.nu parser:
http://about.validator.nu/htmlparser/

I parsed thousands of HTML emails and tested most of the HTML parsers out
there, and validator.nu was the only one that parsed them all.

> On the one hand, I think I can improve my parser to get the HTML I want for
> most of the mails I have to transform.
> On the other hand, I don't know the OpenOffice SDK, WebKit or Mozilla, and
> HTML rendering will be the hardest part...

If you have used Flying Saucer in the past, then go ahead with that.

Stefano
