incubator-odf-dev mailing list archives

From Dave Fisher <dave2w...@comcast.net>
Subject Re: ODF-> PDF (was: [odfdom-dev] Status of the Simple Java API for ODF and ODFDOM - 08/10/2011)
Date Tue, 23 Aug 2011 15:24:05 GMT

On Aug 22, 2011, at 11:52 PM, Jeremias Maerki wrote:

> Some hopefully useful comments inline below... (a little late since I've
> only just joined the mailing list now)

Not late at all. Some of us are just starting with the ODF Toolkit.

Yegor and I have done quite a bit of work in Apache POI on PPT / PPTX rendering. Our main
use is from PS. We comment our PS with layout information. It is not a general format; it
comes from a proprietary layout system that I have developed and maintained for over 30 years,
originally inspired by references to TeX, Metafont, XICS, and Interpress (Knuth on one side
and Warnock on the other).

At a quick glance, there is a good fit with FOP's Intermediate Format.

I'd like to check out the Knuth line-breaking algorithm. This is something critical that MS
has never understood. Take, for example, centering a title that is just a little too wide for
the available width: a Knuth-based scoring system would split it near the middle, whereas PPT
breaks at the first spot that works.
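
To make the difference concrete, here is a minimal sketch in Java. It is not the full Knuth-Plass algorithm (no stretchable glue, no hyphenation) and it is not taken from FOP or POI. It only contrasts a first-fit breaker with a scored breaker that minimizes the sum of squared leftover space over all lines. The class and method names are hypothetical, and measuring width in characters rather than font metrics is a simplifying assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: first-fit vs. scored breaking of a centered title. */
final class TitleBreaker {

    /** Splits on whitespace; a blank input yields no words. */
    static String[] splitWords(String text) {
        String trimmed = text.trim();
        return trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    /** Width of words[from..to) in characters, with one space between words. */
    static int width(String[] words, int from, int to) {
        int w = 0;
        for (int i = from; i < to; i++) {
            w += words[i].length();
        }
        return w + Math.max(0, to - from - 1);
    }

    /** First-fit: put as many words as possible on each line. */
    static List<String> firstFit(String text, int maxWidth) {
        String[] words = splitWords(text);
        List<String> lines = new ArrayList<>();
        int start = 0;
        while (start < words.length) {
            int end = start + 1; // always take at least one word
            while (end < words.length && width(words, start, end + 1) <= maxWidth) {
                end++;
            }
            lines.add(String.join(" ", Arrays.copyOfRange(words, start, end)));
            start = end;
        }
        return lines;
    }

    /** Scored: choose the breaks that minimize the sum of squared slack on all lines. */
    static List<String> balanced(String text, int maxWidth) {
        String[] words = splitWords(text);
        int n = words.length;
        long[] best = new long[n + 1]; // best[i]: minimal cost of laying out words[i..n)
        int[] next = new int[n + 1];   // next[i]: where the line starting at word i ends
        for (int i = n - 1; i >= 0; i--) {
            best[i] = Long.MAX_VALUE;
            for (int j = i + 1; j <= n; j++) {
                int w = width(words, i, j);
                if (w > maxWidth && j > i + 1) {
                    break; // line is too wide; a single overlong word is still allowed
                }
                long slack = Math.max(0, maxWidth - w);
                long cost = slack * slack + best[j];
                if (cost < best[i]) {
                    best[i] = cost;
                    next[i] = j;
                }
            }
        }
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < n; i = next[i]) {
            lines.add(String.join(" ", Arrays.copyOfRange(words, i, next[i])));
        }
        return lines;
    }
}
```

For the title "Status of the Simple Java API for ODF" and a width of 30 characters, firstFit gives "Status of the Simple Java API" / "for ODF", while balanced gives "Status of the Simple" / "Java API for ODF".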

I'll introduce myself more fully later.

Regards,
Dave


> 
> On 16.08.2011 03:14:05 Biao Han wrote:
>> 
>> 
>> FYI. About an ODF->PDF convertor contribution.
>> 
>> Regards
>> 
>> Biao Han (Devin)
>> SOA Standards Growth, Emerging Technology Institute(ETI), IBM China
>> Software Development Laboratory
>> Tel:(86-10)82450541
>> Email: hanbiao@cn.ibm.com
>> Address: 3/F Ring Building, No.28 Building, Zhong Guan Cun Software Park,
>> No. 8 Dong Bei Wang West Road, ShangDi, Haidian District, Beijing,
>> P.R.C.100193
>> ----- Forwarded by Biao Han/China/IBM on 2011-08-16 09:13 -----
>> 
>> From:	Biao Han/China/IBM
>> To:	Angelo zerr <angelo.zerr@gmail.com>
>> Cc:	dev@odfdom.odftoolkit.org, general@incubator.apache.org
>> Date:	2011-08-12 17:41
>> Subject:	Re: [odfdom-dev] Status of the Simple Java API for ODF and
>>            ODFDOM - 08/10/2011
>> 
>> 
>> Angelo zerr <angelo.zerr@gmail.com> wrote on 2011-08-12 17:26:24:
>> 
>>> From: Angelo zerr <angelo.zerr@gmail.com>
>>> To: Biao Han/China/IBM@IBMCN
>>> Cc: dev@odftoolkit.odftoolkit.org, dev@odfdom.odftoolkit.org,
>>> general@incubator.apache.org, dev@simple.odftoolkit.org
>>> Date: 2011-08-12 17:27
>>> Subject: Re: [odfdom-dev] Status of the Simple Java API for ODF and
>>> ODFDOM - 08/10/2011
>>> 
>>> Hi Biao,
>> 
>>> 
>>> Thanks for your intention to contribute. But we found that iText uses
>>> the AGPL license: http://itextpdf.com/terms-of-use/index.php
>>> So it would be difficult to use that in an Apache 2.0 licensed project.
>>> 
>>> Yes, I thought so. That's a real shame :-(
>>> 
>>> 
>>> Do you have a plan to supply a version using PDFBox or FOP? Both of
>>> them would be fine under the Apache 2.0 license.
>>> And as far as I know, PDFBox may be easier; its API is similar to
>>> iText's.
>>> 
>>> XDocReport uses iText because the ODT->PDF conversion works like this:
>>> 
>>> 1) load the ODT with ODFDOM
>>> 2) visit the ODFDOM tree and generate the matching iText PDF structure
>>> for each ODFDOM structure
>>> 
>>> The problem with FOP is that you must have XSL-FO XML to generate PDF.
>>> I have tried to do that (without ODFDOM) with XSL-FO, but performance
>>> is very bad (even with an XSLT cache and xsl:key to cache the
>>> computation of styles...). Perhaps it's possible to use FOP with pure
>>> Java (without XSL-FO XML), but I have not found samples.
> 
> Generating XSL-FO from ODF is certainly something that has some benefit
> on its own. But I wouldn't recommend using XSL-FO when the goal is to
> convert ODF to PDF.
> 
> But something else (as you suspected): Apache FOP has its own PDF
> library which is highly optimized for writing PDFs with very little
> memory consumption. It is also rather fast. And you don't need to use
> XSL-FO. In contrast to Apache PDFBox, Apache FOP has a Graphics2D/Java2D
> implementation (PDFGraphics2D and PDFDocumentGraphics2D) that can make
> generating PDFs easier. The downside is that the PDF library itself is
> not separately documented and you'd have to look into FOP's source code
> for hints on how to use it. I can help with pointers if desired. For now,
> I can recommend looking at PDFDocumentGraphics2D for hints on how to
> create a PDF document with FOP's PDF library from scratch:
> http://svn.apache.org/viewvc/xmlgraphics/fop/trunk/src/java/org/apache/fop/svg/PDFDocumentGraphics2D.java?view=markup
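
The practical appeal of a Graphics2D implementation is that page-painting code can be written once against java.awt.Graphics2D and handed to different targets. The sketch below uses only the JDK and renders to a BufferedImage as a stand-in target. Since FOP's PDFDocumentGraphics2D extends Graphics2D, the same paintPage method could presumably be passed to it instead, but its setup calls are deliberately not shown here and should be taken from the linked source. The class name, page contents, and coordinates are hypothetical.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.geom.Rectangle2D;
import java.awt.image.BufferedImage;

/** Hypothetical sketch: page painting written only against Graphics2D. */
final class PageSketch {

    static final int PAGE_WIDTH = 595;  // A4 width in points
    static final int PAGE_HEIGHT = 842; // A4 height in points

    /** Paints one page; works with any Graphics2D implementation. */
    static void paintPage(Graphics2D g) {
        // White page background.
        g.setColor(Color.WHITE);
        g.fill(new Rectangle2D.Double(0, 0, PAGE_WIDTH, PAGE_HEIGHT));

        // Shaded band, e.g. the header row of a table.
        g.setColor(Color.LIGHT_GRAY);
        g.fill(new Rectangle2D.Double(50, 50, PAGE_WIDTH - 100, 30));

        // Frame around the content area.
        g.setColor(Color.BLACK);
        g.draw(new Rectangle2D.Double(50, 50, PAGE_WIDTH - 100, PAGE_HEIGHT - 100));
    }

    /** Stand-in target: renders the page into an in-memory image. */
    static BufferedImage renderPreview() {
        BufferedImage image =
                new BufferedImage(PAGE_WIDTH, PAGE_HEIGHT, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = image.createGraphics();
        try {
            paintPage(g);
        } finally {
            g.dispose(); // release graphics resources
        }
        return image;
    }
}
```

Text would go through drawString in the same way. It is left out of the sketch so that it does not depend on the fonts installed on the machine.
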
> 
> And then I'd like to point out another possible direction, one that
> would allow you not only to generate PDF but actually all of FOP's
> supported output formats (PDF, PostScript, AFP, PNG/TIFF, PCL, SVG
> etc.). FOP has this so-called Intermediate Format (IF) which is a
> low-level representation of a set of rendered pages (i.e. after layout!).
> The set of instructions is relatively easy and can be produced via Java
> calls or an XML stream. I've attached a sample file that shows the IF
> format in XML representation. Some information on the format is found
> here:
> http://xmlgraphics.apache.org/fop/1.0/intermediate.html#usage-if
> 
> The obvious advantage of using XSL-FO is that you don't have to write
> your own layout engine for line breaking and stuff. But ODF also doesn't
> map 1:1 to XSL-FO (page headers and footers work differently, for
> example). The underlying concepts simply don't match.
> 
> With Graphics2D or FOP's IF format, you'll need at least some kind of
> basic layout engine to do line and page breaking, footnote handling etc.
> I don't know how much of that iText took from you. And I don't know if
> PDFBox could match iText in the layout department. And FOP's layout
> engine is too FO-oriented to be of much use, except maybe for the basic
> implementation of the Knuth line breaking algorithm, which is abstracted
> to a reasonable level.
> 
> Just pointing out possible routes. Obviously, you'll have to decide
> which one fits best.
> 
>>> As for PDFBox, I have never used it. Do you think this library
>>> handles the same things as iText (tables, table rows, image
>>> widgets...) with the same performance? I must study it to see if it's
>>> possible to implement a new converter with PDFBox.
>> Please refer to the cookbook
>> http://pdfbox.apache.org/userguide/cookbook.html
>> Or we can request help from their users mailing list
>> http://pdfbox.apache.org/mail-lists.html#users
>>> 
>>> Thanks a lot for your information,
>>> 
>>> Regards Angelo
>>> 
>>> Anyway, thank you for your eagerness to contribute!
>>> 
>>> 
>>>> 
>>>> Regards Angelo
>>>> 
>>> 
>>>> 2011/8/10 Biao Han <hanbiao@cn.ibm.com>
>>>> (We should send this to the project mailing list, but we don't have
>>>> one yet, so sorry for interrupting everyone on the incubator general
>>>> mailing list)
>>>> 
>>>> ODF Toolkit move to Apache
>>>> 1. An SVN account has been created and is now available for use. We
>>>> will discuss and start the code move after the mailing lists are ready;
>>>> 2. The first board meeting is scheduled for Wed, 17 August 2011, 10
>>>> am Pacific. We have submitted a quarterly board report here.
>>>> 3. As we are now an Apache incubator project, we will discuss
>>>> and release ODF Toolkit in the new community. The original release
>>>> plan has to be cancelled.
>>>> 
>>>> Simple ODF
>>>> 1. Reviewed and pushed a fix for a bug in TextProperties (#bug 357).
>>>> 2. Reviewed and pushed three unit test coverage enhancement patches
>>>> (#bug 241).
>>>> 3. Downloads of Simple ODF 0.6.5 have reached 204, equal to the
>>>> total for Simple ODF 0.4. Version 0.4 took more than 6 months to get
>>>> there, while version 0.6.5 took only 40 days.
>>>> 
>>>> ODFDOM
>>>> 1. Working on digital signatures. Two issues caused by OpenOffice
>>>> are blocking progress.
>>>> (1) OpenOffice.org generates a namespace-unaware signature document,
>>>> and ODFDOM fails to load it.
>>>> (2) OpenOffice.org creates multiple X509Certificate elements instead
>>>> of the correct certificate chain under ds:KeyInfo.
>>>> see also:
>>>> https://bugs.freedesktop.org/show_bug.cgi?id=39657 (ds namespace in
>>>> LibreOffice)
>>>> http://openoffice.org/bugzilla/show_bug.cgi?id=107864 (ds namespace in
>>>> OOo)
>>>> http://openoffice.org/bugzilla/show_bug.cgi?id=66276 (multiple
>>>> X509Certificate in OOo)
>>>> http://openoffice.org/bugzilla/show_bug.cgi?id=108286
>>>> We have to supply two modes to fix it: one follows the ODF
>>>> specification, the other follows OpenOffice. The question is which
>>>> should be the default.
>>>> 2. A new user: XDocReport uses ODFDOM to load and manipulate ODF
>>>> documents. It is a Java API that merges an XML document created with
>>>> MS Office (docx), OpenOffice (odt), or LibreOffice (odt) with a Java
>>>> model to generate a report and, if needed, convert it to another
>>>> format (PDF, XHTML...).
>>>> Regards
>>>> 
>>>> Biao Han (Devin)
>>>> SOA Standards Growth, Emerging Technology Institute(ETI), IBM China
>>>> Software Development Laboratory
>>>> Tel:(86-10)82450541
>>>> Email: hanbiao@cn.ibm.com
>>>> Address: 3/F Ring Building, No.28 Building, Zhong Guan Cun Software
>>>> Park, No. 8 Dong Bei Wang West Road, ShangDi, Haidian District,
>>>> Beijing, P.R.C.100193
> 
> 
> 
> 
> Jeremias Maerki

