corinthia-dev mailing list archives

From Peter Kelly
Subject Re: ODF to HTML
Date Mon, 01 Jun 2015 16:43:35 GMT
(apologies if this is a duplicate - I sent from the wrong email address before)

> On 31 May 2015, at 7:09 pm, Ian C wrote:
> We also need to take into account the style hierarchy. I see from some of the CSS documentation
> that there are mechanisms in place to manage that but have not looked in detail. Any advice

First, some general comments - what I recommend is to first build up a custom data structure
representing all the styles which can later be queried when needed, e.g. when you encounter
an element in content.xml that has a particular style associated with it.

In the Word filter, there are two classes used for this purpose: WordSheet and WordStyle (the
former being a collection of the latter). These are defined in WordSheet.h and WordSheet.c.
Early in the conversion process, the filter goes through the XML document containing the styles
and builds up this data structure. This lets the code deal with the styles at a higher level
of abstraction than examining the DOM tree of styles.xml directly.

A while ago I made a start on the same thing for ODF - there are ODFSheet and ODFStyle classes
defined for the same purpose. So a good next step for tackling styles would be to traverse
the DOM tree of styles.xml and populate this data structure, creating a new ODFStyle object
for each style in the document, and adding them to the (single) ODFSheet object for the document.
This data structure could then be used to generate the CSS text, as is done in the Word filter.
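To make the idea concrete, here is a minimal sketch of that kind of data structure. It is not the actual ODFSheet/ODFStyle code: the names DemoSheet and DemoStyle, the fixed-size array, and the single fontWeight property are all assumptions made for illustration. It shows styles being added to one sheet and a property being resolved by walking up the parent chain, which is one way the style hierarchy could be handled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_STYLES 64

/* Hypothetical stand-in for a style record; not the real ODFStyle. */
typedef struct DemoStyle {
    const char *name;        /* style name, e.g. "Heading_20_1" */
    const char *parentName;  /* parent style name, or NULL if none */
    const char *fontWeight;  /* one example property; NULL if not set here */
} DemoStyle;

/* Hypothetical stand-in for the per-document collection; not the real ODFSheet. */
typedef struct DemoSheet {
    DemoStyle styles[DEMO_MAX_STYLES];
    size_t count;
} DemoSheet;

/* Add a style to the sheet. Returns 0 on success, -1 if the sheet is full. */
int DemoSheet_add(DemoSheet *sheet, const char *name,
                  const char *parentName, const char *fontWeight)
{
    assert(sheet != NULL && name != NULL);
    if (sheet->count >= DEMO_MAX_STYLES)
        return -1;
    DemoStyle *style = &sheet->styles[sheet->count++];
    style->name = name;
    style->parentName = parentName;
    style->fontWeight = fontWeight;
    return 0;
}

/* Look up a style by name. Returns NULL if name is NULL or not found. */
const DemoStyle *DemoSheet_find(const DemoSheet *sheet, const char *name)
{
    if (name == NULL)
        return NULL;
    for (size_t i = 0; i < sheet->count; i++) {
        if (strcmp(sheet->styles[i].name, name) == 0)
            return &sheet->styles[i];
    }
    return NULL;
}

/* Resolve fontWeight for a style, falling back to its ancestors.
   The depth limit guards against a cyclic parent chain. */
const char *DemoSheet_resolveFontWeight(const DemoSheet *sheet, const char *name)
{
    const DemoStyle *style = DemoSheet_find(sheet, name);
    for (size_t depth = 0; style != NULL && depth <= sheet->count; depth++) {
        if (style->fontWeight != NULL)
            return style->fontWeight;
        style = DemoSheet_find(sheet, style->parentName);
    }
    return NULL;
}
```

With a "Heading" style that sets fontWeight to "bold" and a "Heading_20_1" style whose parent is "Heading", resolving the property on "Heading_20_1" falls through to the parent's value.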

> I just generated to div tags do we want that? Mapping to h1... hn could be a better way
> but not sure how to really map the correct heading styles to the hn.

In the case of ODF, the information about which heading level to map to is (usually) available
more directly than in OOXML. Both specs refer to it as the “outline level”. In an ODF document,
heading outline levels start from 1 (just like HTML), and you also have the distinction between
<text:h> and <text:p> elements, so you can tell whether something is a heading
or a regular paragraph.

When encountering a <text:h> element, you can determine the outline level from its
text:outline-level attribute:

<text:h text:style-name="Heading_20_1" text:outline-level="1">Headline One</text:h>

So here the value ‘1’ is sufficient information to indicate that you need to create an
h1 element. The style-name attribute is Heading_20_1, so the corresponding CSS would need
to be:

h1.Heading_20_1 {

and similarly for other levels, e.g.

h2.Heading_20_2 {

Note that, as with your existing code, this would be generated separately from the content
itself, solely based on the information in styles.xml, for the non-automatic styles.
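As a rough illustration of the mapping above, the following sketch builds the selector text from an outline level and a style name. The function name demoHeadingSelector is hypothetical and not part of the existing filter code. Clamping the level to 1..6 is also an assumption on my part, made because HTML only defines h1 to h6; a real implementation might prefer to handle deeper levels differently.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of the existing filter code.
   Writes a selector such as "h1.Heading_20_1" into buf.
   Returns 0 on success, -1 if buf is too small or formatting fails. */
int demoHeadingSelector(int outlineLevel, const char *styleName,
                        char *buf, size_t bufSize)
{
    assert(buf != NULL && bufSize > 0);

    /* Assumption: clamp to the h1..h6 range that HTML defines. */
    if (outlineLevel < 1)
        outlineLevel = 1;
    if (outlineLevel > 6)
        outlineLevel = 6;

    int written;
    if (styleName == NULL || strlen(styleName) == 0) {
        /* No style name: fall back to the bare element selector. */
        written = snprintf(buf, bufSize, "h%d", outlineLevel);
    } else {
        written = snprintf(buf, bufSize, "h%d.%s", outlineLevel, styleName);
    }

    /* snprintf reports the length it wanted; treat truncation as failure. */
    return (written < 0 || (size_t)written >= bufSize) ? -1 : 0;
}
```

Called with an outline level of 1 and the style name Heading_20_1, this fills the buffer with h1.Heading_20_1, matching the selector shown above.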

So I suggest splitting buildCSS_Styles into two separate functions: one which populates the
CSSSheet object associated with the package (that is, package->sheet, which I think is
already created), and another which examines the ODFSheet object and populates the CSSSheet

Dr Peter M. Kelly

PGP key fingerprint: 5435 6718 59F0 DD1F BFA0 5E46 2523 BAA1 44AE 2966
