From: jan i
To: dev
Date: Sun, 3 Aug 2014 19:16:57 +0200
Subject: Re: OOXML
On 3 August 2014 18:50, Peter Kelly wrote:

> On 3 Aug 2014, at 6:52 pm, Regina Henschel wrote:
>
> > Peter Kelly wrote:
> >
> > > There are two ways to view a format: (1) as a way of encoding
> > > information for storage or transmission, and (2) as an in-memory
> > > data structure used by the editor at runtime. In some programs
> > > these are two different things, and in others they are the same.
> > > The latter is true of web browsers - HTML is both the file format
> > > and the runtime data model; the W3C DOM APIs can be used to
> > > manipulate the HTML structure directly. I believe this was also
> > > true to a large extent with the binary formats used by older
> > > versions of MS Office, for purposes of efficiency [1].
> > >
> > > I'm not familiar with the internals of OpenOffice - one thing I'd
> > > be very interested to know is: does it use ODF for its in-memory
> > > representation of the document? Or are the runtime data structures
> > > different to the XML trees that one finds in an ODF package?
> >
> > No, OpenOffice has an in-memory representation that is very
> > different from the ODF format. And the API is a third way of
> > looking at the document.
>
> Interesting.
>
> Given this is the case, what would you suggest would be the best
> strategy for supporting OOXML?
>
> 1) Two-way conversion between OOXML and ODF, with OpenOffice then
>    dealing solely with the file as ODF (not even being aware it came
>    from OOXML originally)
> 2) Two-way conversion between OOXML and OpenOffice's internal
>    representation, bypassing ODF altogether
>
> The second option has the advantage that it would be easier to cater
> for features that are supported in OOXML but not ODF, e.g. table
> styles. However, the first option has the advantage that it would keep
> the core entirely separate from the OOXML filter, and the filter could
> potentially be constructed in a general-purpose manner and made usable
> as a library by other software.
>

From painful experience, I found out that our internal (memory)
structure is a superset of mixed ODF and pre-ODF items.

I don't think you can have a pure ODF/OOXML memory structure; you need
internal pointers as well (like start/finish of copy buffer)... but of
course those 2 parts should have been well separated.

I wonder, you wrote earlier that UX Write uses HTML internally. That
seems to me like the lowest common denominator... I would have thought
a real superset would have been the better choice?

Some parts of AOO use the structure directly, others go through the
API. That is not very clean, and makes it extremely difficult to test
changes in the internal memory layout.

An application like this (and many other similar types) should see the
memory as a capsule, with a fixed API around it.

rgds
jan I

> --
> Dr. Peter M. Kelly
> Founder, UX Productivity
> peter@uxproductivity.com
> http://www.uxproductivity.com/
> http://www.kellypmk.net/
>
> PGP key: http://www.kellypmk.net/pgp-key
> (fingerprint 5435 6718 59F0 DD1F BFA0 5E46 2523 BAA1 44AE 2966)
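
A minimal sketch of the "capsule with a fixed API" idea from the reply
above, written in Python purely for illustration. It is not AOO code:
the names Document, _Storage and import_paragraphs are hypothetical, and
a list of strings stands in for the real document structure. The point
is only that runtime-only state (such as the copy buffer start/finish)
lives next to the document content but is reached through the same
fixed API, so the internal layout can change without touching callers.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class _Storage:
    """Hypothetical internal layout; free to change without breaking callers."""
    paragraphs: List[str] = field(default_factory=list)
    # Runtime-only state with no file-format counterpart,
    # e.g. the start/finish of the copy buffer.
    copy_start: Optional[int] = None
    copy_end: Optional[int] = None


class Document:
    """The capsule: all access goes through these methods."""

    def __init__(self) -> None:
        self._storage = _Storage()

    def append_paragraph(self, text: str) -> int:
        """Add a paragraph and return its index."""
        self._storage.paragraphs.append(text)
        return len(self._storage.paragraphs) - 1

    def paragraph(self, index: int) -> str:
        return self._storage.paragraphs[index]

    def paragraph_count(self) -> int:
        return len(self._storage.paragraphs)

    def mark_copy_range(self, start: int, end: int) -> None:
        """Remember an inclusive paragraph range as the copy buffer."""
        if not 0 <= start <= end < self.paragraph_count():
            raise IndexError("copy range outside document")
        self._storage.copy_start = start
        self._storage.copy_end = end

    def copied_text(self) -> List[str]:
        s = self._storage
        if s.copy_start is None or s.copy_end is None:
            return []
        return s.paragraphs[s.copy_start:s.copy_end + 1]


def import_paragraphs(doc: Document, paragraphs: List[str]) -> None:
    """Stand-in for a file filter: it only uses the public API."""
    for text in paragraphs:
        doc.append_paragraph(text)
```

With this shape, a filter (ODF, OOXML or anything else) never sees
_Storage, so a change to the memory layout can be tested against the
API alone.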