james-mime4j-dev mailing list archives

From Eric Charles <e...@apache.org>
Subject Re: Headless mail renderer
Date Mon, 24 Jan 2011 17:16:34 GMT

I have also used Java/Mozilla integration via JavaXPCOM, which requires
ongoing effort from the developer (API changes, ...). An alternative is to
use an HTML-to-PDF add-on and call it from XUL through a Java/XULRunner
integration.
I have also used Flying Saucer, but didn't know it could generate PDF.
For your use case, there is also the OpenOffice SDK, which is well
documented and supports a wide range of input/output document formats
(HTML, PDF, ...).



On 24/01/2011 15:09, Noss Benoit wrote:
> Thanks for your comments, Stefano. I will look into the directions you
> suggested and keep you informed (if you want).
> Benoît
> On 24.01.2011 11:57, Stefano Bagnara wrote:
>> 2011/1/24 Noss Benoit<benoit.noss@secu.lu>:
>>> Hi Stefano,
>>> thanks for your answer. In the past, I already tried to do this with
>>> the javax.mail.Message class.
>>> It was not a big success: I found lots of issues due to the variety
>>> of incoming mails, so it couldn't go into production.
>> You can tweak JavaMail with some system properties to let it parse
>> more malformed messages.
>> I say this because I think JavaMail is OK for this work, too.
>> Mime4j may be a little simpler, but I'm not sure it is worth porting
>> your code if you already have JavaMail code ready.
>> With both you will still have to deal with MIME parts manually and
>> decide what to do with each part (mime4j removes the complexity of the
>> activation framework and the automatic object decoding done by JavaMail).
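
The message does not say which system properties are meant. As a minimal sketch, the ones below are documented JavaMail system properties that relax MIME parsing; picking these four is an assumption, and the class name LenientJavaMail is hypothetical. JavaMail reads most of them once, when its classes are first loaded, so they should be set before any message is parsed.

```java
public class LenientJavaMail {

    /** Sets JavaMail system properties that relax MIME parsing. Call before parsing any message. */
    public static void enable() {
        // Try to decode RFC 2047 encoded words even when they sit in the middle of a word.
        System.setProperty("mail.mime.decodetext.strict", "false");
        // Let getFileName() decode encoded (non-ASCII) attachment file names.
        System.setProperty("mail.mime.decodefilename", "true");
        // Treat errors in base64 data as end of input instead of failing.
        System.setProperty("mail.mime.base64.ignoreerrors", "true");
        // Accept unquoted special characters in Content-Type / Content-Disposition parameters.
        System.setProperty("mail.mime.parameters.strict", "false");
    }
}
```

The same properties can be passed on the command line instead, for example -Dmail.mime.decodetext.strict=false.
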
>>> With each parsed Message, I tried to build in parallel an XHTML page
>>> representing its content (From:, To:, Subject:, Date: and body content).
>>> When the attachment was a message, I recursively went into it and
>>> appended the info found to the XHTML I had previously created.
>>> When I found HTML, I tried to transform it to XHTML with Tidy, then
>>> to PDF with iText.
>>>     When the XHTML transformation failed and I had a
>>>     multipart/alternative, I rendered the text part to PDF instead.
>>> When I found attached images, I rendered them to PDF.
>>> When I found office documents, I didn't transform them.
>>> After that I merged all the created PDFs into one big PDF and checked
>>> it in to the Documentum DB (one PDF per message).
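
The per-part decisions described above can be sketched as a small dispatch on the MIME type. This is only an illustration of the workflow, not the original code: the class PartDispatch, the Action enum and the choice of which types count as "office documents" (here: everything not otherwise recognised) are assumptions. The fallback from failed HTML conversion to the text/plain alternative would live in the caller.

```java
import java.util.Locale;

public class PartDispatch {

    /** Hypothetical actions mirroring the steps described in the message. */
    public enum Action { RECURSE, HTML_TO_PDF, TEXT_TO_PDF, IMAGE_TO_PDF, KEEP_AS_IS }

    /** Chooses an action from a raw Content-Type header value such as "text/html; charset=UTF-8". */
    public static Action actionFor(String contentType) {
        if (contentType == null) {
            return Action.TEXT_TO_PDF; // a part without Content-Type defaults to text/plain
        }
        // Drop parameters (charset, boundary, name, ...) and normalise case.
        String type = contentType.split(";", 2)[0].trim().toLowerCase(Locale.ROOT);
        if (type.equals("message/rfc822") || type.startsWith("multipart/")) {
            return Action.RECURSE;      // nested message or container: walk its children
        }
        if (type.equals("text/html")) {
            return Action.HTML_TO_PDF;
        }
        if (type.startsWith("text/")) {
            return Action.TEXT_TO_PDF;
        }
        if (type.startsWith("image/")) {
            return Action.IMAGE_TO_PDF;
        }
        return Action.KEEP_AS_IS;       // office documents and anything unknown
    }
}
```
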
>> For XHTML to PDF rendering you may want to evaluate xhtmlrenderer (aka
>> Flying Saucer).
>> It is the best pure Java XHTML renderer out there: it is not close to
>> real web browsers, but much better than the other Java renderers I
>> tested.
>>> The aim of the project is not to have a pretty rendering of every
>>> mail; it's just to keep track of the messages our client sent.
>>> I faced three big issues:
>>> **************************
>>> 0/ multipart/mixed with inline image content in "cid:...."
>> Sure, you have to do manual work with this. Look for parts with a
>> Content-ID and alter the references in the HTML URLs to link to these
>> objects.
>> Depending on your rendering engine you should be able to plug in your
>> own URL resolver and intercept cid: URLs to provide the streams from
>> the appropriate MIME parts (I do that using Flying Saucer).
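
A minimal sketch of the first approach (altering the references in the HTML), using only the JDK. It assumes each part with a Content-ID has already been saved somewhere and that a map from Content-ID to the new URL is available; the class CidRewriter and its method names are hypothetical. It does not show the Flying Saucer URL resolver mentioned above, and it does not handle percent-encoded cid: URLs.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CidRewriter {

    // Matches src="cid:..." or href='cid:...' (case-insensitive, either quote style).
    private static final Pattern CID_REF =
            Pattern.compile("(?i)\\b(src|href)\\s*=\\s*([\"'])cid:([^\"']+)\\2");

    /** Strips the angle brackets from a Content-ID header value, e.g. "<logo@example>". */
    public static String normalizeContentId(String headerValue) {
        String id = headerValue.trim();
        if (id.startsWith("<") && id.endsWith(">")) {
            id = id.substring(1, id.length() - 1);
        }
        return id;
    }

    /** Replaces every cid: reference whose id is in the map; unknown ids are left untouched. */
    public static String rewrite(String html, Map<String, String> urlByContentId) {
        Matcher m = CID_REF.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String target = urlByContentId.get(m.group(3).trim());
            String replacement = (target == null)
                    ? m.group()
                    : m.group(1) + "=" + m.group(2) + target + m.group(2);
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

The map keys should be built with normalizeContentId, because the Content-ID header carries angle brackets while the cid: URL in the HTML does not.
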
>>> 1/ like you said, HTML to PDF rendering is difficult, and (Tidy+iText
>>> or multipart/alternative) was not always working.
>>>     If only I could use the Mozilla components to render it, but my
>>> understanding of them is not good enough
>> You can use Mozilla components or even WebKit: just google and you
>> will find information. I preferred Flying Saucer because I don't want
>> to run X (even Xvfb) on my servers for this task.
>>> 2/ Special characters and encoding problems in headers and attached
>>> file names
>> I've had issues only with oriental encodings: they are difficult to
>> support in Flying Saucer. No problems with European encodings.
>> Stefano
