Message-ID: <4D3DB3F2.2010308@apache.org>
Date: Mon, 24 Jan 2011 18:16:34 +0100
From: Eric Charles
To: mime4j-dev@james.apache.org
Subject: Re: Headless mail renderer
References: <4D3D19CB.4010602@secu.lu> <4D3D578E.5050607@secu.lu> <4D3D8825.3050906@secu.lu>
In-Reply-To: <4D3D8825.3050906@secu.lu>

Hi,

fyi I also used java/mozilla
integration via javaxpcom, which needs investment from the developer (API
changes, ...).
An alternative is to use an HTML-to-PDF add-on and call it from XUL with a
java/xulrunner integration.

I also used Flying Saucer but didn't know it was able to generate PDF.

For your use case, there's also the OpenOffice SDK, which is really well
documented and supports a wide range of input/output document formats
(html, pdf, ...).

Tks,
Eric

On 24/01/2011 15:09, Noss Benoit wrote:
> thanks for your comments Stefano, I will look in the directions you
> suggested and keep you informed (if you want to)
>
> Benoît
>
>
> On 24.01.2011 11:57, Stefano Bagnara wrote:
>> 2011/1/24 Noss Benoit:
>>> Hi Stefano,
>>> thanks for your answer. In the past, I already tried to do this with
>>> the javax.mail.Message class.
>>> It was not a big success, and I found lots of issues due to the
>>> variety of incoming mails, so it couldn't get into production.
>> You can tweak javamail with some system properties to let it parse some
>> more malformed messages.
>> I say this because I think javamail is ok for this work, too.
>> Mime4j may be a little simpler, but I'm not sure it is worth porting your
>> code if you already have javamail code ready.
>>
>> With both you will anyway have to manually deal with mime parts and
>> decide what to do with each part (mime4j removes the complexity of the
>> activation framework and automatic object decoding done by javamail).
>>
>>> With each parsed Message, I tried to build in parallel an xhtml page
>>> representing its content (From: To: Subject: Date: and body content).
>>> When the attachment was a message, I recursively went into it and
>>> appended the info found to the xhtml I previously created.
>>> When I found html, I tried to transform it to XHTML with tidy, then
>>> to PDF with iText.
>>> When the XHTML transformation failed and I had a
>>> multipart/alternative, I then rendered the txt to PDF.
>>> When I found attached images, I rendered them to PDF.
>>> When I found office documents, I didn't transform them.
>>> After that I merged all created PDFs into one big PDF and checked it
>>> in to Documentum DB (for one message, one pdf).
>> For xhtml to pdf rendering you may want to evaluate xhtmlrenderer (aka
>> Flying Saucer).
>> It is the best pure java xhtml renderer out there: it is not near to
>> real web browsers but much better than other java rendering I tested.
>>
>>> The aim of the project is not to have a pretty rendering of all
>>> mail, it's just to keep track of messages our client sent.
>>>
>>> I faced three big issues :
>>> **************************
>>> 0/ multipart/mixed with inline image content in "cid:...."
>> Sure, you have to do manual work with this. Look for parts with
>> Content-ID and alter references in the html urls to link to these
>> objects.
>> Depending on your rendering engine you should be able to plug in your
>> own url resolver and intercept cid: urls to provide the streams from
>> the appropriate mime parts (I do that using Flying Saucer).
>>
>>> 1/ like you said html to pdf rendering is difficult and (tidy+iText or
>>> multipart/alternative) was not always working.
>>> If only I could use the Mozilla components to render it, but my
>>> understanding of it is not high enough
>> You can use mozilla components or even webkit: just google and you
>> will find information.
>> I preferred Flying Saucer because I don't want
>> to run X (even xvfb) on my servers for this task.
>>
>>> 2/ Special characters and encoding problems in headers and attached
>>> file names
>> I've had issues only with oriental encodings: they are difficult to
>> support in Flying Saucer. No problems with European encodings.
>>
>> Stefano
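
For reference, Stefano's suggestion about issue 0/ (find the parts that carry a Content-ID and alter the cid: references in the html) could look roughly like the sketch below. It is not taken from anyone's code in this thread: the class name CidRewriter, both method names, and the file: URLs are made up for illustration, and it assumes the inline parts have already been extracted (with javamail or mime4j) and saved somewhere the renderer can reach. It uses only the JDK and does not handle percent-encoded cid: values.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: rewrites cid: references in an HTML body so they
// point at locations where the matching MIME parts were saved.
final class CidRewriter {
    // Matches src="cid:..." or href='cid:...', case-insensitively.
    private static final Pattern CID_REF = Pattern.compile(
            "(?i)\\b(src|href)\\s*=\\s*([\"'])cid:([^\"']+)\\2");

    // A Content-ID header value is normally wrapped in angle brackets,
    // while the cid: URL in the html carries the bare id.
    static String normalizeContentId(String headerValue) {
        String id = headerValue.trim();
        if (id.startsWith("<") && id.endsWith(">")) {
            id = id.substring(1, id.length() - 1);
        }
        return id;
    }

    // cidToUrl maps a bare content id to the URL the renderer should load.
    // References with no matching part are left untouched.
    static String rewrite(String html, Map<String, String> cidToUrl) {
        Matcher m = CID_REF.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String target = cidToUrl.get(m.group(3));
            String replacement = (target == null)
                    ? m.group()
                    : m.group(1) + "=" + m.group(2) + target + m.group(2);
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> parts = new LinkedHashMap<>();
        // Assumed: the part with this Content-ID was saved to this path.
        parts.put(normalizeContentId("<logo@example>"), "file:/tmp/mail-123/logo.png");
        String html = "<img src=\"cid:logo@example\"><img src='cid:missing'>";
        System.out.println(rewrite(html, parts));
    }
}
```

Rewriting the html up front keeps the renderer unaware of MIME. The other route Stefano mentions, plugging a url resolver into the rendering engine, avoids writing the parts to disk but depends on the engine's own API.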