httpd-docs mailing list archives

From André Malo
Subject Re: PDF transforms (was: PDF transforms, was Re: Stop shipping XML)
Date Tue, 31 Dec 2002 22:54:00 GMT
* Erik Abele wrote:

> André Malo wrote:

>> The first step was to learn xsl-fo and the limitations of fop (*sigh*).
>> The current stage consists of a pdf file per document - optimized for
>> print. There are just some final nits, that I'm currently picking.

> cool...I'm keen on seeing the first pages ... will have more time the next
> days and would really like to help picking out some nits :)


OK, I think the current stuff is now usable (but it still needs some 
further work :)
You can get an impression at <>.

Every one of our XML source files got a PDF counterpart _optimized for 
print_. The PDF files don't contain any clickable links or other 
online-reading features. The layout is more or less taken from 
manual-print.css, with some enhancements that are not possible with pure 
CSS.
Instead of making links clickable, which is not useful for printing ;-), I 
decided to extract the relevant URLs from the respective href attributes 
and put them into the document as footnotes (issue 4, see below).

However, the PDF stuff has a lot of implications:

1) Including appropriate links in the corresponding HTML files requires 
the metafiles that I also proposed for the language links (we need the 
filename of the PDF). (I combined both on the example page, of course ;-)

2) Extracting the hrefs takes the given URLs apart and makes them 
absolute. This requires knowing the current path (from the document's 
point of view). This is also solved by the metafiles.

3) For non-Latin scripts we cannot use the standard PDF fonts; we have to 
embed others. My generated PDFs currently use Unicode Times and Courier 
from my Win2k installation for the Russian PDFs. For Japanese I'm currently 
using MSMincho from the Japanese language pack, but I wasn't able to use 
bold or italic variants. I also don't know which monospace font is suitable 
for Japanese. I hope I'll get some hints here :)
Font embedding in its current form also has some general drawbacks:
  - You cannot copy and paste from the non-Latin PDFs, since no characters 
    are stored, only *references to glyphs*. This has to be solved 
    anyway for an all-in-one PDF.
  - I'm not sure about license issues. fop's TTFReader says "no 
    restrictions", but who knows? In general I think it would be better to 
    use some free fonts that we can put into CVS or so.
  - The build system is currently somewhat specialized, since fop has some 
    serious bugs with path names etc. (I needed the latest beta to make it
    work at all! *sigh*)
    The whole PDF build system needs some cleanup.

4) fop's footnote support is buggy (it sometimes produces notes that 
overlap with regular content etc.), so I decided to put them into an extra 
section that appears at the end of the document. (Look at a sample PDF file 
if you don't understand what I mean.)

5) Table support is limited. fop doesn't support automatic table layout, so 
we have to manage that manually. Not much of a problem, I think, since once 
created, the table layout file will very seldom be touched.
I put the table definitions of all XML files into one file per language.

6) fop doesn't support a lot of useful things, keep conditions etc. But I 
can live with that until they're implemented. (For example, headings 
sometimes appear at the bottom of one page and the text follows on the 
subsequent page.)
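
To make points 2 and 4 a bit more concrete, here is a rough sketch in 
Python (not the XSLT that actually does the work): it collects every href 
in a document, resolves it against the document's own location, and 
numbers the results so they can be listed in a notes section at the end. 
The function name, the sample markup and the example.org base URL are 
made up for illustration; in the real build the base path would come from 
the metafiles.

```python
from urllib.parse import urljoin
import xml.etree.ElementTree as ET


def collect_link_notes(xml_text, base_url):
    """Return [(note_number, absolute_url), ...] in document order.

    base_url is the (assumed) location of the document itself; relative
    hrefs are resolved against it. Repeated URLs share one note.
    """
    root = ET.fromstring(xml_text)
    seen = set()
    notes = []
    for elem in root.iter():
        href = elem.get("href")
        if href is None:
            continue
        absolute = urljoin(base_url, href)
        if absolute not in seen:
            seen.add(absolute)
            notes.append((len(notes) + 1, absolute))
    return notes


if __name__ == "__main__":
    # Hypothetical sample document and location, for illustration only.
    sample = (
        '<doc>'
        '<a href="../install.html">install</a>'
        '<a href="directives.html">directives</a>'
        '<a href="http://www.apache.org/">ASF</a>'
        '</doc>'
    )
    base = "http://example.org/docs/mod/core.html"
    for number, url in collect_link_notes(sample, base):
        print(f"[{number}] {url}")
```

Listing each URL only once keeps the notes section short when a page 
links to the same target several times.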

However, I think I'm missing a lot of stuff in this description that I 
can't remember right now; I will post it later then ;-)

Comments, help, and questions are welcome :)

The xsl stuff can be found at <>. 
The build stuff at <> (including the fop 

wishing you all a happy new year, etc.

Treat your password like your toothbrush. Don't let anybody else
use it, and get a new one every six months.  -- Clifford Stoll

                                    (found in ssl_engine_pphrase.c)
