corinthia-dev mailing list archives

From Ian C <>
Subject Re: ODF to HTML
Date Tue, 02 Jun 2015 11:44:59 GMT
Thanks Peter,

On Tue, Jun 2, 2015 at 12:43 AM, Peter Kelly <> wrote:

> (apologies if this is a duplicate - I sent from the wrong email address
> before)
> > On 31 May 2015, at 7:09 pm, Ian C <> wrote:
> >
> > We also need to take into account the style hierarchy. I see from some
> > of the CSS documentation that there are mechanisms in place to manage that
> > but have not looked in detail. Any advice, Peter?
> First, some general comments - what I recommend is to first build up a
> custom data structure representing all the styles which can later be
> queried when needed, e.g. when you encounter an element in content.xml that
> has a particular style associated with it.
> In the Word filter, there are two classes used for this purpose: WordSheet
> and WordStyle (the former being a collection of the latter). These are
> defined in WordSheet.h and WordSheet.c. Early in the conversion process,
> the filter goes through the XML document containing the styles and builds
> up this data structure. This results in the code being able to deal with
> the styles at a higher level of abstraction than examining the DOM tree of
> styles.xml directly.
> A while ago I made a start on the same thing for ODF - there are ODFSheet
> and ODFStyle classes defined for the same purpose. So a good next step for
> tackling styles would be to traverse the DOM tree of styles.xml and
> populate this data structure, creating a new ODFStyle object for each style
> in the document, and adding them to the (single) ODFSheet object for the
> document. This data structure could then be used to generate the CSS text,
> as is done in the Word filter.
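
A minimal sketch of the kind of structure described above: a sheet that owns a growable array of style records, filled once and queried by name later. SketchStyle, SketchSheet and the three functions are hypothetical stand-ins, not the real ODFStyle/ODFSheet API, and the styles.xml traversal itself is left out; the fields simply mirror the ODF attributes style:name, style:parent-style-name and style:family.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; the real ODFStyle/ODFSheet may look different. */
typedef struct {
    char *name;        /* style:name, e.g. "Heading_20_1" */
    char *parentName;  /* style:parent-style-name, or NULL if none */
    char *family;      /* style:family, e.g. "paragraph", or NULL */
} SketchStyle;

typedef struct {
    SketchStyle *styles;
    size_t count;
    size_t capacity;
} SketchSheet;

/* Duplicate a string; NULL in gives NULL out. */
static char *copyString(const char *s)
{
    if (s == NULL)
        return NULL;
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}

/* Append one style record. Returns 0 on success, -1 on allocation failure. */
int SketchSheetAdd(SketchSheet *sheet, const char *name,
                   const char *parentName, const char *family)
{
    assert(sheet != NULL && name != NULL);
    if (sheet->count == sheet->capacity) {
        size_t newCapacity = sheet->capacity ? sheet->capacity * 2 : 8;
        SketchStyle *grown = realloc(sheet->styles, newCapacity * sizeof(*grown));
        if (grown == NULL)
            return -1;
        sheet->styles = grown;
        sheet->capacity = newCapacity;
    }
    SketchStyle style = { copyString(name), copyString(parentName), copyString(family) };
    if (style.name == NULL ||
        (parentName != NULL && style.parentName == NULL) ||
        (family != NULL && style.family == NULL)) {
        free(style.name);
        free(style.parentName);
        free(style.family);
        return -1;
    }
    sheet->styles[sheet->count++] = style;
    return 0;
}

/* Linear lookup by name. The pointer is only valid until the next Add. */
const SketchStyle *SketchSheetFind(const SketchSheet *sheet, const char *name)
{
    for (size_t i = 0; i < sheet->count; i++) {
        if (strcmp(sheet->styles[i].name, name) == 0)
            return &sheet->styles[i];
    }
    return NULL;
}

/* Release every record and reset the sheet to empty. */
void SketchSheetFree(SketchSheet *sheet)
{
    for (size_t i = 0; i < sheet->count; i++) {
        free(sheet->styles[i].name);
        free(sheet->styles[i].parentName);
        free(sheet->styles[i].family);
    }
    free(sheet->styles);
    sheet->styles = NULL;
    sheet->count = 0;
    sheet->capacity = 0;
}
```

Starting from a zero-initialised SketchSheet, the traversal would call SketchSheetAdd once per style element, and the content conversion would call SketchSheetFind with the text:style-name it encounters.
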
> > I just generated div tags; do we want that? Mapping to h1... hn could
> > be a better way, but I am not sure how to map the correct heading styles
> > to the hn.
> In the case of ODF, the information about what header to map to is
> (usually) available more directly than in OOXML. Both specs refer to it as
> the “outline level”. In an ODF document, heading outline levels start from
> 1 (just like HTML), but you also have the distinction between <text:h> and
> <text:p> elements, so you can know whether something is a heading or a
> regular paragraph.
> When encountering a <text:h> element, you can determine the outline level
> from the attribute, e.g.:
> <text:h text:style-name="Heading_20_1" text:outline-level="1">Headline One</text:h>
> So here the value ‘1’ is sufficient information to indicate that you need
> to create a h1 element. The style-name attribute is Heading_20_1, so the
> corresponding CSS would need to be:
> h1.Heading_20_1 {
> }
> and similarly for other levels, e.g.
> h2.Heading_20_2 {
> }
> Note that, as with your existing code, this would be generated separately
> from the content itself, solely based on the information in styles.xml, for
> the non-automatic styles.
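
The mapping above can be sketched as a pair of small helpers that turn an outline level and a style name into a heading tag and a CSS selector such as h1.Heading_20_1. The function names are hypothetical, and clamping levels above 6 to h6 is my own assumption (HTML stops at h6), not something the thread settles.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write "h1".."h6" for an ODF outline level into buf.
   Assumption: out-of-range levels are clamped into 1..6.
   Returns 0 on success, -1 if buf is too small. */
int headingTagForLevel(int outlineLevel, char *buf, size_t size)
{
    if (outlineLevel < 1)
        outlineLevel = 1;
    if (outlineLevel > 6)
        outlineLevel = 6;
    int written = snprintf(buf, size, "h%d", outlineLevel);
    return (written >= 0 && (size_t)written < size) ? 0 : -1;
}

/* Build a selector such as "h1.Heading_20_1" from the outline level and the
   value of text:style-name. Returns 0 on success, -1 on bad input or if the
   result would not fit in buf. */
int headingSelector(int outlineLevel, const char *styleName, char *buf, size_t size)
{
    char tag[4];
    if (styleName == NULL || strlen(styleName) == 0)
        return -1;
    if (headingTagForLevel(outlineLevel, tag, sizeof(tag)) != 0)
        return -1;
    int written = snprintf(buf, size, "%s.%s", tag, styleName);
    return (written >= 0 && (size_t)written < size) ? 0 : -1;
}
```

The same tag name would be used both when creating the element for a <text:h> in the content and when emitting the rule from styles.xml, so the two stay in step.
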
> So I suggest separating buildCSS_Styles into two separate functions: One
> which populates the CSSSheet object associated with the package (that is,
> package->sheet, which I think is already created), and another which
> examines the ODFSheet object and populates the CSSSheet object.

I will try to make it so.

Re inheritance: I saw some documentation referring to "we use the
special -uxwrite-parent".
I was trying to figure out how that can/should be applied to the
inheritance in the styles of a document.

Not sure if you saw that email about the tool I created but one of the
things it does is show the styles in an odt document and which ones are
actually used.
Often many more styles are defined than are used. Something I struggle with
but understand the argument for.

The base properties (font etc.) are inherited, and a derived style just
changes, say, the font-weight. Should we walk up the tree and build a
definitive CSS style, or use CSS's built-in inheritance in some way? I
suspect so, just not sure how yet, or whether -uxwrite-parent is meant
to help.
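
To make the first option concrete, here is a minimal sketch of walking up the parent chain to find the effective value of one property, with the nearest definition winning. ChainStyle, ChainProp and effectiveProperty are hypothetical names, the fixed-size property array is only there to keep the example short, and nothing here says how -uxwrite-parent is actually used.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_PROPS 4   /* small fixed size, for the sketch only */
#define MAX_DEPTH 64  /* bound on the walk, guards against a parent cycle */

typedef struct {
    const char *name;   /* e.g. "font-weight" */
    const char *value;  /* e.g. "bold" */
} ChainProp;

typedef struct ChainStyle {
    const char *name;
    const struct ChainStyle *parent;  /* NULL for a root style */
    ChainProp props[MAX_PROPS];
    size_t propCount;
} ChainStyle;

/* Return the value of propName as seen from style: the style's own value if
   it sets one, otherwise the nearest ancestor's, otherwise NULL. */
const char *effectiveProperty(const ChainStyle *style, const char *propName)
{
    assert(propName != NULL);
    for (int depth = 0; style != NULL && depth < MAX_DEPTH; depth++) {
        for (size_t i = 0; i < style->propCount; i++) {
            if (strcmp(style->props[i].name, propName) == 0)
                return style->props[i].value;
        }
        style = style->parent;
    }
    return NULL;
}
```

Flattening a style would then mean calling this for every property name seen anywhere in its chain; the alternative is to emit only each style's own properties and record the parent link separately.
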

And where there are more styles defined in a document than are used, should
we add them to the CSS so they are still available in a round trip of odt
to html, html to odt, and have the final odt keep the original styles?

I will take a look. I should be working on other things but am getting
hooked. So will try to take it to a point we can build on.

> Dr Peter M. Kelly
> PGP key: <>
> (fingerprint 5435 6718 59F0 DD1F BFA0 5E46 2523 BAA1 44AE 2966)


Ian C
