james-mime4j-dev mailing list archives

From Noss Benoit <benoit.n...@secu.lu>
Subject Re: Headless mail renderer
Date Mon, 24 Jan 2011 14:09:41 GMT
Thanks for your comments, Stefano. I will look in the directions you
suggested and keep you informed, if you like.


On 24.01.2011 11:57, Stefano Bagnara wrote:
> 2011/1/24 Noss Benoit<benoit.noss@secu.lu>:
>> Hi Stefano,
>> thanks for your answer. In the past, I already tried to do this with the
>> javax.mail.Message class.
>> It was not a big success: I found lots of issues due to the variety of
>> incoming mails, so it never got into production.
> You can tweak javamail with some system properties to let it parse
> more malformed messages.
> I say this because I think javamail is ok for this work, too.
> Mime4j may be a little simpler, but I'm not sure it is worth porting your
> code if you already have javamail code ready.
> With both you will still have to deal manually with mime parts and
> decide what to do with each part (mime4j removes the complexity of the
> activation framework and automatic object decoding done by javamail).
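
As an illustration of the system properties mentioned above, here is a minimal sketch that switches on a few of the lenient-parsing options listed in the javax.mail.internet package documentation. The class name LenientMailParsing and the particular selection of properties are assumptions made for this example; which options actually help depends on the incoming mail and on the JavaMail version, and some of them may already default to the values shown.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: collects a few JavaMail system properties that relax MIME parsing.
final class LenientMailParsing {

    // Property names come from the javax.mail.internet package documentation;
    // the selection itself is only an example.
    static Map<String, String> lenientProperties() {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("mail.mime.decodetext.strict", "false");  // decode encoded words found inside other text
        p.put("mail.mime.parameters.strict", "false");  // tolerate sloppy header parameters
        p.put("mail.mime.base64.ignoreerrors", "true"); // ignore trailing garbage in base64 data
        p.put("mail.mime.decodefilename", "true");      // decode encoded attachment file names
        p.put("mail.mime.multipart.ignoremissingendboundary", "true");
        return p;
    }

    // Call early at startup: some of these values are read only once by JavaMail.
    static void apply() {
        lenientProperties().forEach(System::setProperty);
    }
}
```

Calling LenientMailParsing.apply() before any message is parsed keeps the tweaks in one place; the same values could also be passed as -D options on the JVM command line.
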
>> With each parsed Message, I tried to build in parallel an xhtml page
>> representing its content (From: To: Subject: Date: and body content)
>> When the attachment was a message, I recursively went into it and appended
>> the info found to the xhtml I previously created
>> When I found html, I tried to transform it to XHTML with tidy, then to PDF
>> with iText
>> When the XHTML transformation failed and the message had a
>> multipart/alternative, I rendered the txt part to PDF instead
>> When I found attached images, I rendered them to PDF
>> When I found office documents I didn't transform them
>> After that I merged all created PDFs into one big PDF and checked it in to
>> Documentum DB (for one message, one pdf)
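
The per-part decisions described above can be sketched as a recursive walk over the MIME tree. This is only an outline under stated assumptions: Part is a hypothetical, simplified stand-in for a parsed part (neither the javamail nor the mime4j type), and the step names are placeholders for the actual tidy, iText and merge calls.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical planner: decides what to do with each part, without rendering anything.
final class MailPartPlanner {

    // Simplified stand-in for a parsed MIME part (an assumption of this sketch).
    static final class Part {
        final String mimeType;
        final List<Part> children;

        Part(String mimeType, Part... children) {
            this.mimeType = mimeType.toLowerCase(Locale.ROOT);
            this.children = List.of(children);
        }
    }

    // Returns the ordered list of rendering steps for a message.
    static List<String> plan(Part root) {
        List<String> steps = new ArrayList<>();
        collect(root, steps);
        return steps;
    }

    private static void collect(Part part, List<String> steps) {
        String type = part.mimeType;
        if (type.equals("multipart/alternative")) {
            // Prefer the html alternative and keep the plain text as a fallback.
            boolean html = hasChild(part, "text/html");
            boolean text = hasChild(part, "text/plain");
            if (html && text) steps.add("html-to-pdf-with-text-fallback");
            else if (html) steps.add("html-to-pdf");
            else if (text) steps.add("text-to-pdf");
        } else if (type.equals("message/rfc822") || type.startsWith("multipart/")) {
            // An embedded message contributes its own From/To/Subject/Date block.
            if (type.equals("message/rfc822")) steps.add("render-headers");
            for (Part child : part.children) collect(child, steps);
        } else if (type.equals("text/html")) {
            steps.add("html-to-pdf");
        } else if (type.equals("text/plain")) {
            steps.add("text-to-pdf");
        } else if (type.startsWith("image/")) {
            steps.add("image-to-pdf");
        } else {
            steps.add("skip:" + type); // e.g. office documents are left untouched
        }
    }

    private static boolean hasChild(Part part, String mimeType) {
        for (Part child : part.children) {
            if (child.mimeType.equals(mimeType)) return true;
        }
        return false;
    }
}
```

Separating the planning from the rendering makes it easier to see which kind of part caused a failure before any PDF code is involved.
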
> For xhtml to pdf rendering you may want to evaluate xhtmlrenderer (aka
> Flying Saucer).
> It is the best pure java xhtml renderer out there: it is not close to
> real web browsers but much better than the other java renderers I tested.
>> The aim of the project is not to have a pretty rendering of all mail, it's
>> just to keep track of messages our client sent.
>> I faced three big issues :
>> **************************
>> 0/ multipart/mixed with inline image content in "cid:...."
> Sure, you have to do manual work with this. Look for parts with
> Content-ID and alter references in the html urls to link to these
> objects.
> Depending on your rendering engine you should be able to plug your own
> url resolver and intercept cid: urls to provide the streams from the
> appropriate mime parts (I do that using Flying Saucer)
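
For the first approach mentioned above (altering the references in the html), a minimal sketch could look like the following. CidRewriter and its method names are hypothetical, the regular expression only covers quoted src and href attributes, and percent-encoded cid: urls are not handled. The map is assumed to hold, for each Content-ID, a url the renderer can load, for example a temporary file written from the corresponding mime part.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: replaces cid: references in html with renderer-loadable urls.
final class CidRewriter {

    // Matches src="cid:..." or href='cid:...' (quoted attribute values only).
    private static final Pattern CID_REF = Pattern.compile(
            "(?i)\\b(src|href)\\s*=\\s*([\"'])cid:([^\"']+)\\2");

    // Rewrites known cid: references; unknown ones are left unchanged.
    static String rewrite(String html, Map<String, String> urlByContentId) {
        Matcher m = CID_REF.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String target = urlByContentId.get(m.group(3));
            String replacement = (target == null)
                    ? m.group()
                    : m.group(1) + "=" + m.group(2) + target + m.group(2);
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Content-ID header values are wrapped in angle brackets; cid: urls are not.
    static String normalizeContentId(String headerValue) {
        String v = headerValue.trim();
        if (v.length() >= 2 && v.startsWith("<") && v.endsWith(">")) {
            v = v.substring(1, v.length() - 1);
        }
        return v;
    }
}
```

The map keys would be built with normalizeContentId from the Content-ID header of each part. The url resolver approach avoids writing temporary files, but it depends on the rendering engine's own extension points.
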
>> 1/ like you said html to pdf rendering is difficult and (tidy+iText or
>> multipart/alternative) was not always working.
>>     If only I could use the Mozilla components to render it, but my
>> understanding of it is not high enough
> You can use mozilla components or even webkit: just google and you
> will find information. I preferred Flying Saucer because I don't want
> to run X (even xvfb) on my servers for this task.
>> 2/ Special characters and encoding problems in headers and attached file names
> I've had issues only with oriental encodings: they are difficult to
> support in Flying Saucer. No problems with European encodings.
> Stefano

