openoffice-dev mailing list archives

From jan i <>
Subject Re: OOXML
Date Sun, 03 Aug 2014 17:16:57 GMT
On 3 August 2014 18:50, Peter Kelly <> wrote:

> On 3 Aug 2014, at 6:52 pm, Regina Henschel <>
> wrote:
> Peter Kelly schrieb:
> There's two ways to view a format: (1) as a way of encoding information
> for storage or transmission, and (2) as an in-memory data structure used
> by the editor at runtime. In some programs these are two different
> things, and in others they are the same. The latter is true of web
> browsers - HTML is both the file format and the runtime data model; the
> W3C DOM APIs can be used to manipulate the HTML structure directly. I
> believe this was also true to a large extent with the binary formats
> used by older versions of MS Office, for purposes of efficiency [1].
> I'm not familiar with the internals of OpenOffice - one thing I'd be
> very interested to know is does it use ODF for its in-memory
> representation of the document? Or are the runtime data structures used
> different to the XML trees that one finds in an ODF package?
> No, OpenOffice has a very different in-memory representation than the ODF
> format. And the API is a third version of looking at the document.
> Interesting.
> Given this is the case, what would you suggest would be the best strategy
> for supporting OOXML?
> 1) Two-way conversion between OOXML and ODF, with OpenOffice then dealing
> solely with the file as ODF (not even being aware it came from OOXML
> originally)
> 2) Two-way conversion between OOXML and OpenOffice's internal
> representation, bypassing ODF altogether
> The second option has the advantage that it would be easier to cater for
> features that are supported in OOXML but not ODF, e.g. table styles.
> However the first option has the advantage that it would keep the core
> entirely separate from the OOXML filter, and could potentially be
> constructed in a general-purpose manner and made usable as a library by
> other software.

From painful experience, I found that our internal (in-memory) structure is
a superset of mixed ODF and pre-ODF items. I don't think you can have a pure
ODF/OOXML memory structure; you need internal pointers as well (like the
start/finish of the copy buffer)... but of course those two parts should have
been well separated.

I wonder: you wrote earlier that UXwrite uses HTML internally. That seems
to me like the lowest common denominator... I would have thought a real
superset would have been the better choice?

Some parts of AOO use the structure directly, others go through the API.
That is not very clean, and it makes it extremely difficult to test changes
to the internal memory layout. An application like this (and many similar
ones) should treat the memory as a capsule, with a fixed API around it.

jan I

> --
> Dr. Peter M. Kelly
> Founder, UX Productivity
> PGP key:
> (fingerprint 5435 6718 59F0 DD1F BFA0 5E46 2523 BAA1 44AE 2966)
