corinthia-dev mailing list archives

From Peter Kelly <>
Subject Re: [Document Model] Initial questions about web-based application
Date Wed, 11 Mar 2015 00:25:47 GMT
> On 8 Mar 2015, at 10:20 pm, Franz de Copenhague <>
> I agree that HTML5 is a good model to feed into the editing library to support the edition
> of paragraphs, lists, text, tables, images. But what about sections, headers, footers, fields
> (author, date, etc ), styles and themes? All of them are document features implemented either
> docx or odt and so far they are not supported by DocFormat API.

So this is where things get tricky :)

HTML5 does not directly support all the features of OOXML/ODF word processing documents, so
we need to figure out whether or not we are going to support these features and, if so, how.
With UX Write I’ve always taken the position that it was never intended to be a complete
replacement for Word/OO and that it was subject to the inherent limitations of HTML (e.g.
no page breaks, tabs, headers/footers etc). But I got a *lot* of complaints about the lack
of those features, which left me in a difficult situation, as those can only properly be added
(at least in terms of doing the layout calculations) by modifying the web layout engine itself
- although some of them can be “faked” using javascript. But I think we can find a way
to support most of these.

Sections: (This term is unfortunately ambiguous, as it can mean both a numbered part of a document,
e.g. “See Section 3.2 for details”, and a part of the document that has separate page layout
settings). We could support these using a <div> with a custom CSS class, e.g. “corinthia-section”,
which means that a browser or any other HTML-supporting program will still be able to make
sense of the document, only that we will know that class=“corinthia-section” has special
semantics that we handle appropriately in both DocFormats and the editor.

There are actually a few instances already where I’ve used custom class names for this purpose
- see DocFormats/core/src/common/DFClassNames.h. Currently these use the “uxwrite-” prefix,
which should be changed to “corinthia-”. This is a fairly easy task for someone to take
on if they want to start making a contribution, since it’s largely just find and
replace. When the change occurs we must also update the tests.
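
As a rough sketch of that find and replace (not a finished tool), assuming the class names
appear as quoted string literals that start with the old prefix. The sample line is made up for
illustration and is not copied from DFClassNames.h:

```python
import re

OLD_PREFIX = "uxwrite-"
NEW_PREFIX = "corinthia-"

# Match the old prefix only directly after an opening quote, so occurrences
# in comments or running text are left for a human to review.
_PATTERN = re.compile(r'(["\'])' + re.escape(OLD_PREFIX))


def rename_prefix(source: str) -> str:
    """Return source with the class-name prefix replaced inside string literals."""
    return _PATTERN.sub(lambda m: m.group(1) + NEW_PREFIX, source)


# Hypothetical sample line; the real header may be laid out differently.
sample = '#define EXAMPLE_CLASS "uxwrite-example" /* was uxwrite-example */'
print(rename_prefix(sample))
```

A class attribute holding several names (where the prefix does not directly follow the quote)
would be missed by this pattern, so the tests would still need a manual pass.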

For sections, we could alternatively use the <article> tag which is also in HTML5, and
thinking about it I’d actually favour this more than a div since then we can avoid relying
on a custom class name. There is a <section> element also but this is for sections in
the “see section 3.2” sense (i.e. what appears in the table of contents of a report).

Headers and footers: HTML5 actually has <header> and <footer> elements - but,
bizarrely, they don’t seem to be intended for the same purpose we associate with them
in traditional word processing. However, just checking the spec now, it seems they’ve made
it a little clearer. Even if browsers won’t necessarily display them properly as such,
due to the non-paginated layout model used on the web, it’s at least the closest we can
get in terms of how we represent things. We may be able to have the editor use CSS tricks
to display the header and footer content at the top and bottom of the screen.

Fields: There are a few of these that are handled already, though the set is fairly limited.
These are:

- Table of contents
- List of figures
- List of tables
- Cross-reference (to a section, figure or table) - can be text only, label + number, caption
text, etc.

See DFClassNames.h for the list of these, and also grep through the JS files in Editor/src
and the OOXML filter to see how they’re used. I think using custom CSS class names to identify
them, and perhaps data- attributes where we need extra information would be appropriate.
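
To illustrate the idea, here is a minimal sketch of how a consumer might pick fields out of the
HTML by class name and read any extra information from data- attributes. The class names
“corinthia-toc” and “corinthia-ref” and the data-style attribute are hypothetical, chosen
only for this example; the real names are whatever DFClassNames.h defines.

```python
from html.parser import HTMLParser

FIELD_PREFIX = "corinthia-"  # assumed prefix; the class names used below are hypothetical


class FieldCollector(HTMLParser):
    """Collects (field kind, data attributes) for every element marked as a field."""

    def __init__(self):
        super().__init__()
        self.fields = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        classes = (attr_map.get("class") or "").split()
        kinds = [c[len(FIELD_PREFIX):] for c in classes if c.startswith(FIELD_PREFIX)]
        if not kinds:
            return  # ordinary element, not a field
        # Gather data-* attributes, dropping the "data-" part of the name.
        extra = {k[len("data-"):]: v for k, v in attr_map.items() if k.startswith("data-")}
        self.fields.append((kinds[0], extra))


sample = (
    '<nav class="corinthia-toc"></nav>'
    '<p>See <a class="corinthia-ref" href="#fig1" data-style="label-number">Figure 1</a>.</p>'
)
collector = FieldCollector()
collector.feed(sample)
collector.close()
print(collector.fields)
```

A browser that knows nothing about these classes still renders the link and its text, which is
the point of keeping the extra semantics in class names and data- attributes.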

Incidentally, one nice thing about how these are handled in the Editor is that it updates them
automatically, in the same way that a spreadsheet automatically recalculates formulas. Every
time you add, remove, or rename a section (<h1> to <h6>), figure, or table (in
the case of the latter two, changing the content of the caption), the table of contents and all
cross-references are updated. This is handled in Outline.js. This also reports changes to
the outline structure of these items to callback functions, so the editor can display a “document
map” or outline view in the UI.
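
The general pattern can be sketched as follows. This is a toy model of the idea only, not a
port of Outline.js: the Outline class, its method names, and the flat (non-nested) section
numbering are all assumptions made for the example.

```python
class Outline:
    """Toy outline that keeps derived text in sync with the list of sections."""

    def __init__(self):
        self._sections = []   # [section_id, title] pairs in document order
        self._listeners = []  # callbacks invoked with the new table of contents

    def add_listener(self, callback):
        self._listeners.append(callback)

    def table_of_contents(self):
        return [f"{n} {title}" for n, (_, title) in enumerate(self._sections, start=1)]

    def reference_text(self, section_id):
        """Text a cross-reference to the given section would currently show."""
        for n, (sid, _) in enumerate(self._sections, start=1):
            if sid == section_id:
                return f"Section {n}"
        return "?"

    def add_section(self, section_id, title, index=None):
        position = len(self._sections) if index is None else index
        self._sections.insert(position, [section_id, title])
        self._changed()

    def remove_section(self, section_id):
        self._sections = [s for s in self._sections if s[0] != section_id]
        self._changed()

    def rename_section(self, section_id, title):
        for section in self._sections:
            if section[0] == section_id:
                section[1] = title
        self._changed()

    def _changed(self):
        # Recompute derived content once, then notify every listener.
        toc = self.table_of_contents()
        for callback in self._listeners:
            callback(toc)


outline = Outline()
snapshots = []
outline.add_listener(snapshots.append)  # e.g. a "document map" view in the UI
outline.add_section("intro", "Introduction")
outline.add_section("impl", "Implementation")
outline.add_section("design", "Design", index=1)
print(snapshots[-1])
print(outline.reference_text("impl"))
```

Inserting “Design” before “Implementation” renumbers the latter, so both the table of contents
and any cross-reference to it pick up the new number without the caller doing anything extra.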

Styles: Already handled, via CSS. See for example DocFormats/filters/ooxml/src/word/WordStyles.c
which is where the translation is done for OOXML Word documents.

In the Editor’s JS code there are no facilities for manipulating styles directly other
than simply getting and setting the CSS text. DocFormats provides a set of classes for representing
CSS stylesheets, styles, and property collections, which can be used in native (C/C++/Objective-C)
code. UX Write uses this API, and the Qt editor can do the same. For the web-based version
of an editor, we’ll need to create a similar set of data structures for the Web UI.
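
To give a feel for the shape such data structures might take, here is a small sketch with a
stylesheet, styles, and property collections. The class and function names are invented for
this example and do not mirror the DocFormats API; the parser handles only simple
“name: value” declarations and would break on values that themselves contain semicolons.

```python
class Style:
    """A selector plus an ordered collection of CSS properties."""

    def __init__(self, selector, properties=None):
        self.selector = selector
        self.properties = dict(properties or {})

    def css_text(self):
        body = "; ".join(f"{name}: {value}" for name, value in self.properties.items())
        return f"{self.selector} {{ {body} }}"


def parse_declarations(text):
    """Parse 'name: value; name: value' into a dict (no comments, no nested blocks)."""
    properties = {}
    for declaration in text.split(";"):
        name, sep, value = declaration.partition(":")
        if sep and name.strip():
            properties[name.strip().lower()] = value.strip()
    return properties


class StyleSheet:
    """Maps selectors to Style objects and serializes them back to CSS text."""

    def __init__(self):
        self._styles = {}

    def style(self, selector):
        """Return the style for selector, creating an empty one if needed."""
        return self._styles.setdefault(selector, Style(selector))

    def css_text(self):
        return "\n".join(s.css_text() for s in self._styles.values())


sheet = StyleSheet()
sheet.style("h1").properties.update(parse_declarations("font-size: 24pt; Color: navy;"))
sheet.style("h1").properties["color"] = "maroon"  # edit one property directly
print(sheet.css_text())
```

With something like this the UI can change a single property and regenerate the CSS text,
rather than editing the text by hand.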

Themes: I’m not sure what the best strategy for this is, but I’d say something along the
lines of CSS stylesheets that can be reused among different documents would probably be the
way to go. This requires a lot of thought and investigation.

Dr Peter M. Kelly

PGP key: <>
(fingerprint 5435 6718 59F0 DD1F BFA0 5E46 2523 BAA1 44AE 2966)
