incubator-odf-dev mailing list archives

From Jeremias Maerki <...@jeremias-maerki.ch>
Subject Re: ODF-> PDF (was: [odfdom-dev] Status of the Simple Java API for ODF and ODFDOM - 08/10/2011)
Date Tue, 23 Aug 2011 06:52:51 GMT
Some hopefully useful comments inline below... (a little late since I've
only just joined the mailing list now)

On 16.08.2011 03:14:05 Biao Han wrote:
> 
> 
> FYI. About an ODF->PDF convertor contribution.
> 
> Regards
> 
> Biao Han (Devin)
> SOA Standards Growth, Emerging Technology Institute(ETI), IBM China
> Software Development Laboratory
> Tel:(86-10)82450541
> Email: hanbiao@cn.ibm.com
> Address: 3/F Ring Building, No.28 Building, Zhong Guan Cun Software Park,
> No. 8 Dong Bei Wang West Road, ShangDi, Haidian District, Beijing,
> P.R.C.100193
> ----- Forwarded by Biao Han/China/IBM on 2011-08-16 09:13 -----
> 
> From:	Biao Han/China/IBM
> To:	Angelo zerr <angelo.zerr@gmail.com>
> Cc:	dev@odfdom.odftoolkit.org, general@incubator.apache.org
> Date:	2011-08-12 17:41
> Subject:	Re: [odfdom-dev] Status of the Simple Java API for ODF and
>             ODFDOM - 08/10/2011
> 
> 
> Angelo zerr <angelo.zerr@gmail.com> wrote on 2011-08-12 17:26:24:
> 
> > From: Angelo zerr <angelo.zerr@gmail.com>
> > To: Biao Han/China/IBM@IBMCN
> > Cc: dev@odftoolkit.odftoolkit.org, dev@odfdom.odftoolkit.org,
> > general@incubator.apache.org, dev@simple.odftoolkit.org
> > Date: 2011-08-12 17:27
> > Subject: Re: [odfdom-dev] Status of the Simple Java API for ODF and
> > ODFDOM - 08/10/2011
> >
> > Hi Biao,
> 
> >
> > Thanks for your contribution intention. But we found iText uses the
> > AGPL license: http://itextpdf.com/terms-of-use/index.php
> > So it would be difficult to use that in an Apache 2.0 licensed project.
> >
> > Yes, I thought so. That's a real shame :-(
> >
> >
> > Do you have a plan to supply a version using PDFBox or FOP? Both of
> > them will be OK for the Apache 2.0 license.
> > And as far as I know, PDFBox may be easier. Its API is similar to
> > iText.
> >
> > XDocReport uses iText because ODT->PDF processes like this :
> >
> > 1) load ODT with ODFDOM
> > 2) visit ODFDOM and generate iText structure PDF per ODFDOM structure
> >
> > The problem with FOP is that you must have XML FO to generate PDF. I
> > have tried to do that (without ODFDOM) with XSL-FO, but performance
> > is very bad (even with an XSLT cache and xsl:key to cache the computation
> > of styles...). Perhaps it's possible to use FOP with pure Java (without
> > XML FO), but I have not found samples.

Generating XSL-FO from ODF is certainly something that has some benefit
on its own. But I wouldn't recommend using XSL-FO when the goal is to
convert ODF to PDF.

But something else (as you suspected): Apache FOP has its own PDF
library which is highly optimized for writing PDFs with very little
memory consumption. It is also rather fast. And you don't need to use
XSL-FO. In contrast to Apache PDFBox, Apache FOP has a Graphics2D/Java2D
implementation (PDFGraphics2D and PDFDocumentGraphics2D) that can make
generating PDFs easier. The downside is that the PDF library itself is
not separately documented and you'd have to look into FOP's source code
for hints on how to use it. I can help with pointers if desired. For now,
I can recommend looking at PDFDocumentGraphics2D for hints on how to
create a PDF document with FOP's PDF library from scratch:
http://svn.apache.org/viewvc/xmlgraphics/fop/trunk/src/java/org/apache/fop/svg/PDFDocumentGraphics2D.java?view=markup
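
To illustrate the Graphics2D route, here is a minimal sketch that is not
taken from FOP's code. It only uses the JDK and paints onto a BufferedImage.
The names PagePainter, paintPage and renderPreview are made up for this
example. The idea is that paintPage only depends on java.awt.Graphics2D, so
the same painting code could in principle be handed a PDFDocumentGraphics2D
instance instead (how to set that one up is what the source file above
shows; it is deliberately left out here).

```java
import java.awt.BasicStroke;
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

// Hypothetical example class, not part of FOP or ODFDOM.
public class PagePainter {

    /** Paints one page onto whatever Graphics2D implementation is passed in. */
    public static void paintPage(Graphics2D g, int width, int height) {
        // White page background
        g.setColor(Color.WHITE);
        g.fillRect(0, 0, width, height);

        // Blue header bar across the top 20 units
        g.setColor(Color.BLUE);
        g.fillRect(0, 0, width, 20);

        // Black frame standing in for the body area
        g.setColor(Color.BLACK);
        g.setStroke(new BasicStroke(1f));
        g.drawRect(10, 30, width - 20, height - 40);
    }

    /** Renders the page to a bitmap, e.g. for a quick preview or a test. */
    public static BufferedImage renderPreview(int width, int height) {
        BufferedImage image =
                new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = image.createGraphics();
        try {
            paintPage(g, width, height);
        } finally {
            g.dispose();
        }
        return image;
    }
}
```

Keeping the painting code free of any PDF-specific calls means the choice of
output target stays in one place.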

And then I'd like to point out another possible direction, one that
would allow you not only to generate PDF but actually all of FOP's
supported output formats (PDF, PostScript, AFP, PNG/TIFF, PCL, SVG
etc.). FOP has this so-called Intermediate Format (IF) which is a
low-level representation of a set of rendered pages (i.e. after layout!).
The set of instructions is relatively simple and can be produced via Java
calls or an XML stream. I've attached a sample file that shows the IF
format in XML representation. Some information on the format is found
here:
http://xmlgraphics.apache.org/fop/1.0/intermediate.html#usage-if

The obvious advantage of using XSL-FO is that you don't have to write
your own layout engine for line breaking and stuff. But ODF also doesn't
map 1:1 to XSL-FO (page headers and footers work differently, for
example). The underlying concepts simply don't match.

With Graphics2D or FOP's IF format, you'll need at least some kind of basic
layout engine to do line and page breaking, footnote handling etc. I
don't know how much of that work iText took off your hands. And I don't
know if PDFBox could match iText in the layout department. And FOP's layout
engine is too FO-oriented to be of much use, except maybe for the basic
implementation of the Knuth line breaking algorithm, which is abstracted to
a reasonable level.
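
To give an idea of the smallest possible piece of such a layout engine, here
is a rough sketch of line breaking. Note that this is a plain greedy
(first-fit) breaker, not the Knuth algorithm mentioned above and not FOP's
implementation. It also assumes that line width is measured in characters; a
real converter would measure with font metrics. The class name
GreedyLineBreaker and the method breakLines are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class, not part of FOP, PDFBox or iText.
public class GreedyLineBreaker {

    /**
     * Splits text into lines of at most maxChars characters by filling each
     * line with as many words as fit. A single word longer than maxChars is
     * placed on a line of its own and is not hyphenated.
     */
    public static List<String> breakLines(String text, int maxChars) {
        List<String> lines = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String word : text.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // happens only for empty or all-whitespace input
            }
            // Start a new line if adding " " + word would overflow this one
            if (current.length() > 0
                    && current.length() + 1 + word.length() > maxChars) {
                lines.add(current.toString());
                current.setLength(0);
            }
            if (current.length() > 0) {
                current.append(' ');
            }
            current.append(word);
        }
        if (current.length() > 0) {
            lines.add(current.toString());
        }
        return lines;
    }
}
```

For example, breakLines("the quick brown fox jumps", 10) yields the three
lines "the quick", "brown fox" and "jumps". Page breaking can follow the same
first-fit pattern with lines instead of words, but footnotes, tables and
keep-together rules are where the real work starts.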

Just pointing out possible routes. Obviously, you'll have to decide
which one fits best.

> > As for PDFBox, I have never used it. Do you think this library manages
> > the same things as iText (tables, table rows, image widgets...) and with
> > the same performance? I must study it to see if it's possible to
> > implement a new converter with PDFBox.
> Please reference the cookbook
> http://pdfbox.apache.org/userguide/cookbook.html
> Or we can request help from their user mail list
> http://pdfbox.apache.org/mail-lists.html#users
> >
> > Thanks a lot for your information,
> >
> > Regards Angelo
> >
> > Whatever, thank you for your eager contribution intention!
> >
> >
> > >
> > > Regards Angelo
> > >
> >
> > > 2011/8/10 Biao Han <hanbiao@cn.ibm.com>
> > > (We should send this to the project mailing list, but we don't have one
> > > yet, so sorry for interrupting those on the incubator general mailing
> > > list)
> > >
> > > ODF Toolkit move to Apache
> > > 1. SVN account has been created and is now available for use. We
> > > will discuss and start the code move after mail lists are ready;
> > > 2. The first board meeting is scheduled for Wed, 17 August 2011, 10
> > > am Pacific. We have submitted a quarterly board report to here.
> > > 3. As we are now an Apache incubator project, we will discuss
> > > and release ODF Toolkit in the new community. The original release
> > > plan has to be cancelled.
> > >
> > > Simple ODF
> > > 1. Reviewed and pushed a bug about TextProperties (#bug 357).
> > > 2. Reviewed and pushed three unit test coverage enhancement patches
> > > (#bug 241) .
> > > 3. Simple ODF 0.6.5 has reached 204 downloads. This equals the
> > > number for Simple ODF 0.4, but version 0.4 took more than 6 months to
> > > get there, while version 0.6.5 took only 40 days.
> > >
> > > ODFDOM
> > > 1. Working on data signature. Two issues caused by OpenOffice are
> > > blocking the process.
> > > (1) OpenOffice.org generates a namespace-unaware signature document,
> > > which ODFDOM fails to load.
> > > (2) OpenOffice.org creates multiple X509Certificates instead of the
> > > correct certification chain under ds:KeyInfo.
> > > see also:
> > > https://bugs.freedesktop.org/show_bug.cgi?id=39657 (ds namespace in
> > > LibreOffice)
> > > http://openoffice.org/bugzilla/show_bug.cgi?id=107864 (ds namespace in
> > > OOo)
> > > http://openoffice.org/bugzilla/show_bug.cgi?id=66276 (multiple
> > > X509Certificate in OOo)
> > > http://openoffice.org/bugzilla/show_bug.cgi?id=108286
> > > We have to supply two modes to fix it. One follows ODF
> > > specification, the other follows Open Office. The question is which
> > > is the default?
> > > 2. A new user: XDocReport uses ODFDOM to load and manipulate ODF
> > > documents. It's a Java API to merge XML documents created with MS Office
> > > (docx), OpenOffice (odt) or LibreOffice (odt) with a Java model to
> > > generate reports and, if needed, convert them to another format (PDF,
> > > XHTML...).
> > > Regards
> > >
> > > Biao Han (Devin)
> > > SOA Standards Growth, Emerging Technology Institute(ETI), IBM China
> > > Software Development Laboratory
> > > Tel:(86-10)82450541
> > > Email: hanbiao@cn.ibm.com
> > > Address: 3/F Ring Building, No.28 Building, Zhong Guan Cun Software
> > > Park, No. 8 Dong Bei Wang West Road, ShangDi, Haidian District,
> > > Beijing, P.R.C.100193




Jeremias Maerki
